NONINFORMATIVE  PRIORS  AND  BAYESIAN  INFERENCE 


By 

YEONG-HWA  KIM 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 


2000 


Copyright  2000 
by 

Yeong-Hwa  Kim 


Dedicated  to  God  for  giving  me  strength, 
my  parents, 

my  brother  (Yeong-Taeg), 
my  wife  (Mee-Jung), 

my  daughter  (Ha-Na),  and  my  son  (Tae-Meen) 
with  all  my  love  from  Jesus. 


ACKNOWLEDGEMENTS 


Like many things in life, obtaining a Ph.D. degree is a task that cannot be
accomplished singlehandedly. I am grateful to all of those persons who were directly or
indirectly responsible for the production of this dissertation.

My  advisor,  the  Distinguished  Professor  Malay  Ghosh,  who  served  as  the  Ph.D. 
committee  chair,  is  to  be  thanked  for  his  guidance,  mentoring,  and  cordial  attention 
over the years. He lent a great deal of support and encouragement, and kept a watchful
eye over me through the graduate program. Professors Ramon Littell, Beverly
Brechner,  James  Booth  and  James  Hobert,  who  compose  the  remainder  of  the  Ph.D. 
committee,  are  also  to  be  thanked  for  their  expert  advice  and  guidance.  I also  would 
like  to  give  my  special  thanks  to  Dr.  Ronald  Randles,  Chairman  of  the  Department 
of  Statistics,  who  supported  me  financially  throughout  my  years  at  the  University  of 
Florida. 

I express  heartfelt  gratitude  to  my  parents,  brother,  and  sister  for  their  never- 
ending  support,  encouragement,  prayer  and  love  throughout  all  my  endeavors.  I will 
forever  be  grateful  to  them  for  their  encouragement  to  continue  my  study  and  for  the 
dreams  they  hold  for  me. 

Finally,  but  most  importantly,  special  thanks  to  my  wife,  Mee-Jung.  I could  not 
have  completed  this  work  without  her  unwavering  love,  patience,  and  prayer.  Even 
when  I did  not,  she  always  had  confidence  in  the  Lord,  that  I would  complete  this 
dissertation.  I also  would  like  to  express  my  enormous  love  to  my  daughter,  Ha-Na, 
and  my  son,  Tae-Meen. 




TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTERS

1 INTRODUCTION

1.1 Literature Review
1.2 The Subject of This Dissertation

2 THE BEHRENS-FISHER PROBLEM

2.1 Introduction
2.2 Development of Noninformative Priors
2.2.1 Probability Matching Priors
2.2.2 HPD Matching Prior
2.2.3 Matching via Inversion of Posterior Bartlett Corrected Likelihood Ratio Statistics
2.2.4 Jeffreys’ General Rule Prior
2.2.5 Reference Priors
2.3 Propriety of Posteriors and Simulation Study
2.4 Conditional Frequentist Properties
2.5 Concluding Remarks

3 INTERVAL ESTIMATION OF THE COMMON MEAN OF SEVERAL NORMAL POPULATIONS

3.1 Introduction
3.2 Noninformative Priors
3.2.1 Jeffreys’ Prior
3.2.2 Reference Priors
3.2.3 Quantile Matching Priors
3.2.4 HPD Matching Prior
3.2.5 Matching via Inversion of Posterior Bartlett Corrected Likelihood Ratio Statistics
3.3 Propriety of Posteriors
3.4 Simulation Study
3.5 Comparison with Other Frequentist Methods
3.5.1 Confidence Interval for $\mu$ Based on $T_j$’s
3.5.2 Confidence Interval for $\mu$ Based on $F_j$’s
3.5.3 Confidence Regions for $\mu$ Based on $P_j$’s
3.5.4 Data Analysis
3.6 Concluding Remarks

4 BAYESIAN INFERENCE FOR THE RATIOS OF REGRESSION COEFFICIENTS IN A LINEAR MODEL

4.1 Introduction
4.2 Some Special Examples
4.2.1 The Linear Calibration Problem
4.2.2 The Fieller-Creasy Problem
4.2.3 Parallel-Line Bioassay Problem
4.2.4 Slope-Ratio Bioassay Problem
4.3 The Orthogonal Transformation
4.4 Development of Noninformative Priors
4.4.1 Jeffreys’ Prior
4.4.2 Reference Priors
4.4.3 Probability Matching Priors
4.5 Propriety of Posteriors
4.6 Numerical Examples
4.7 Appendix

5 SUMMARY AND FUTURE RESEARCH

5.1 Summary
5.2 Future Research

BIOGRAPHICAL SKETCH




LIST OF TABLES

Table

2.1 Catalog of reference priors for Behrens-Fisher Problem
2.2 Frequentist probabilities of 0.05 (0.95) posterior quantiles of $\theta_1$ when $\mu_1 = 2.0$, $\mu_2 = 0.0$ and $\sigma_2^2 = 1.0$ ($n_1 > n_2$)
2.3 Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\theta_1$ when $\mu_1 = 2.0$, $\mu_2 = 0.0$ and $\sigma_2^2 = 1.0$ ($n_1 < n_2$)
3.1 Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\mu$ when $\mu = 0.0$ and $\sigma_2^2 = 1.0$ ($n_1 > n_2$)
3.2 Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\mu$ when $\mu = 0.0$ and $\sigma_2^2 = 1.0$ ($n_1 < n_2$)
3.3 Percentage of albumin in plasma protein
3.4 Interval Estimates for $\mu$
3.5 Selenium in non-fat milk powder
3.6 Interval Estimates for $\mu$
4.1 Catalog of the elements of C
4.2 Responses in an assay of vitamin
4.3 Posterior quantiles
4.4 Responses in an assay of riboflavin in malt
4.5 Posterior quantiles




LIST OF FIGURES

Figure

3.1 Posterior pdf of $\mu$
3.2 Posterior pdf of $\mu$




Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 


NONINFORMATIVE  PRIORS  AND  BAYESIAN  INFERENCE 


By 

Yeong-Hwa  Kim 
August  2000 

Chairman:  Malay  Ghosh 
Major  Department:  Statistics 

The  prime  focus  of  this  dissertation  is  Bayesian  analysis  of  some  classical  problems 
of  statistical  inference.  In  particular,  we  have  derived  several  “default”  priors  such 
as  Jeffreys’  prior,  reference  priors,  and  probability  matching  priors  for  the  Behrens- 
Fisher  problem,  estimation  of  the  common  mean  of  several  normal  populations,  and 
for  the  ratios  of  regression  coefficients  in  a linear  model.  The  latter  includes,  as 
special cases, the linear calibration problem, the Fieller-Creasy problem, slope-ratio
bioassay,  and  parallel-line  bioassay. 

For the Behrens-Fisher problem, we find an orthogonal transformation in the
sense of Cox and Reid and propose an alternative prior such that the coverage
probability of the resulting credible interval for the difference of the two means matches
asymptotically the corresponding frequentist coverage probability more accurately
than under Jeffreys’ prior. Though the result holds only asymptotically, our simulation
study indicates that the second order probability matching prior performs very well in
terms of matching the target coverage probabilities in a frequentist sense, especially
for small and moderate sample sizes. The prior is also justified from the conditional
frequentist perspective.

For the interval estimation of the common mean of several normal populations,
Jeffreys’ prior, reference priors, a class of first order probability matching priors and
a second order probability matching prior are derived. Though the matching is justified
only asymptotically, our simulation results indicate that the second order probability
matching prior meets the frequentist target coverage probabilities better than the
reference prior, especially for small and moderate sample sizes. Several confidence
intervals that are proposed from a frequentist point of view are compared against the
Bayesian method by analyzing two real examples. The lengths of the Bayesian intervals
are nearly equivalent to the shortest frequentist intervals under both circumstances.
Also, in this case, it turns out that among the frequentist procedures, there is no
clear-cut winner.

For the ratios of regression coefficients in a linear model, we find an orthogonal
transformation and derive Jeffreys’ prior, reference priors and a class of first order
probability matching priors. A second order probability matching prior is also derived
and the propriety of the posterior distributions is discussed. Several real examples are
analyzed.




CHAPTER  1 
INTRODUCTION 


In this chapter we discuss some pertinent literature related to the development of
“default” or “noninformative” priors. Also, we introduce the topics of this
dissertation.


1.1  Literature  Review 

Bayesian techniques have found wider acceptance in recent years in the theory
and practice of statistics. This can be partly explained by the fact that even in
the presence of little or vague prior information, Bayesian techniques can often be
used successfully by employing the so-called “diffuse”, “vague” or “noninformative”
priors. Thus, not surprisingly, over the years, a wide range of noninformative priors
has been proposed and studied.

When prior distributions have no population basis, they can be difficult to
construct, and there has long been a desire for prior distributions that can be guaranteed
to play a minimal role in the posterior distribution. Such priors are often called vague,
diffuse or noninformative priors. The rationale for using noninformative priors is
often said to be “let the data speak for themselves” so that inferences are unaffected
by the prior.

The earliest use of noninformative priors is attributed to Laplace (1812), who
recommended using a flat prior over the entire parameter space. However, a uniform
prior lacks invariance under reparameterization since a uniform distribution under
one parameterization will not yield, on transformation, another uniform distribution
unless the transformation is linear. For example, a uniform prior for the standard
deviation $\sigma$ will not transform into a uniform prior for the variance $\sigma^2$. This lack
of invariance of the uniform prior often translates into considerable variation in the
resulting posteriors. To overcome this difficulty, Jeffreys (1961) proposed the prior
proportional to the positive square root of the determinant of the Fisher information
matrix, which remains invariant under any one-to-one reparameterization. Despite its
success in one-parameter problems, Jeffreys’ prior often runs into serious difficulties
in multiparameter problems when only one or more parametric functions
of the parameter vector $\theta$ ($p \times 1$) are of inferential interest and the remaining
parameters are nuisance parameters. For example, Berger and Bernardo (1992a) showed
that Jeffreys’ prior can lead to an inconsistent estimator of the error variance in the
balanced one-way normal ANOVA model when the number of unknown treatment
parameters goes to infinity in direct proportion to sample size. Here Jeffreys’ prior
fails to avoid the Neyman-Scott (1948) phenomenon.
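
The contrast between the flat prior and Jeffreys’ prior under reparameterization can be checked numerically. The following is a minimal sketch, assuming a normal model with known mean and unknown standard deviation $\sigma$, for which Jeffreys’ prior is proportional to $1/\sigma$ on the $\sigma$ scale and to $1/\sigma^2$ on the variance scale; the function and variable names are illustrative only.

```python
import math

def transform_density(prior_sigma, phi):
    """Density induced on phi = sigma**2 by a density on sigma (change of variables)."""
    sigma = math.sqrt(phi)
    jacobian = 1.0 / (2.0 * sigma)          # |d sigma / d phi|
    return prior_sigma(sigma) * jacobian

def flat(sigma):
    return 1.0                              # uniform prior on sigma

def jeffreys_sigma(sigma):
    return 1.0 / sigma                      # Jeffreys' prior on the sigma scale

def jeffreys_phi(phi):
    return 1.0 / phi                        # Jeffreys' prior derived on the variance scale

phis = [0.5, 1.0, 2.0, 4.0]

# The flat prior on sigma induces a density on the variance that changes with phi.
flat_induced = [transform_density(flat, p) for p in phis]

# Jeffreys' prior transforms into Jeffreys' prior: the ratio is the same constant everywhere.
ratios = [transform_density(jeffreys_sigma, p) / jeffreys_phi(p) for p in phis]

print(flat_induced)
print(ratios)
```

The induced flat density varies with the variance, whereas the ratio for Jeffreys’ prior is constant, which is all that invariance requires of an improper prior defined up to proportionality.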

To  overcome  these  deficiencies  of  Jeffreys’  prior,  Bernardo  (1979)  introduced  the 
so-called  “reference  prior”  for  multiparameter  problems  by  dividing  the  parameter 
vector  into  parameters  of  interest  and  nuisance  parameters.  This  idea  was  further 
extended  and  generalized  by  Berger  and  Bernardo  (1989,  1992a,b)  by  splitting  the 
parameter  vector  into  two  or  more  groups  according  to  their  order  of  importance. 
As  pointed  out  by  these  authors,  Jeffreys’  prior  is  also  a reference  prior  when  all  the 
parameters  are  treated  as  equally  important. 

The idea behind the development of a reference prior is as follows: For an
experiment with density $p(x|\theta)$ and prior density $\pi(\theta)$, consider the amount of information
about $\theta$ that the experiment can be expected to provide. Writing $\pi(\theta|x)$ as the
posterior of $\theta$ given $x$, Bernardo (1979) argues for using, as a measure of this information,

$$I\{p, \pi\} = \int\!\!\int p(x|\theta)\,\pi(\theta)\,\log\frac{\pi(\theta|x)}{\pi(\theta)}\,d\theta\,dx .$$

The reference prior is the $\pi$ that maximizes this quantity, the rationale being
that the larger this information is, the less informative is the prior. For a variety
of technical reasons, the reference prior is actually defined, not for the experiment
$p(x|\theta)$, but via an asymptotic limit of iid repetitions of the experiment. In situations
where asymptotic normality of the posterior holds, Bernardo (1979) showed that the
reference prior for $\theta$, provided there are no nuisance parameters, is Jeffreys’ (1961)
prior $\pi(\theta) = |I(\theta)|^{1/2}$. Here $I(\theta)$ is the Fisher information matrix and $|A|$ denotes
the determinant of a matrix $A$. Such, however, is not the case in the presence of
nuisance parameters.
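
To make the expected-information measure concrete, the sketch below evaluates it for a small experiment. It is an illustration under stated assumptions, not part of the reference prior algorithm: the experiment is taken to be $n$ Bernoulli trials, the prior is discretized on a grid of $\theta$ values so that the double integral becomes a double sum, and the names `expected_information`, `uniform` and `peaked` are hypothetical.

```python
import math

def expected_information(prior, thetas, n):
    """Expected information for n Bernoulli trials and a prior on a grid of theta values."""
    total = 0.0
    for x in range(n + 1):
        # joint mass p(x | theta) * pi(theta) at each grid point
        joint = [math.comb(n, x) * t**x * (1.0 - t)**(n - x) * w
                 for t, w in zip(thetas, prior)]
        marginal = sum(joint)
        for j, w in zip(joint, prior):
            # posterior / prior = joint / (marginal * prior)
            total += j * math.log(j / (marginal * w))
    return total

def normalise(weights):
    s = sum(weights)
    return [w / s for w in weights]

K = 50
thetas = [(i + 0.5) / K for i in range(K)]
uniform = normalise([1.0] * K)
peaked = normalise([math.exp(-200.0 * (t - 0.5) ** 2) for t in thetas])

info_uniform = expected_information(uniform, thetas, n=10)
info_peaked = expected_information(peaked, thetas, n=10)
print(info_uniform, info_peaked)
```

A prior concentrated near $\theta = 0.5$ leaves little to be learned from the data, so its expected information is much smaller than that of the uniform grid prior. No claim is made here about which prior maximizes the measure for finite $n$, since the reference prior is defined through an asymptotic limit.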

While Bernardo (1979) considered only two-group reference priors by splitting
the parameters into two groups (parameters of interest and nuisance parameters),
Berger and Bernardo (1992b) considered the more general case of m-group reference
priors. We will follow their notation to introduce the algorithm for the general
m-group reference priors in a particular setup which covers most examples that are
available in the literature.


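Consider a parametric statistical problem in which the random observation $X$ has
density $p(x|\theta)$, where $\theta \in \Theta \subset \mathcal{R}^k$ is the unknown parameter. We assume that the
Fisher information matrix $I(\theta) = \left[\left\{-E_\theta\left(\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\log p(X|\theta)\right)\right\}\right]$ exists and has rank $k$,
so that it is invertible.

We assume that the $\theta_i$ are separated into $m$ groups of sizes $n_1, \ldots, n_m$, and that
these groups are given by

$$\theta_{(1)} = (\theta_1, \ldots, \theta_{n_1}),\quad \theta_{(2)} = (\theta_{n_1+1}, \ldots, \theta_{n_1+n_2}),\quad \ldots,$$

$$\theta_{(i)} = (\theta_{N_{i-1}+1}, \ldots, \theta_{N_i}),\quad \ldots,\quad \theta_{(m)} = (\theta_{N_{m-1}+1}, \ldots, \theta_k),$$

where $N_j = n_1 + \cdots + n_j$ for $j = 1, \ldots, m$. Also, we define, for $j = 1, \ldots, m$,

$$\theta_{[j]} = (\theta_{(1)}, \ldots, \theta_{(j)}),\qquad \theta_{[\sim j]} = (\theta_{(j+1)}, \ldots, \theta_{(m)}),$$

with $\theta_{[m]} = \theta$ and $\theta_{[\sim m]} = \phi$, the empty set. Here $\theta_{(1)}$ is the parameter of interest.
For the construction of reference priors, often the ordering of the remaining groups
does not matter.

Let $\Theta^l \subset \Theta$, $l = 1, 2, \ldots$, denote the compact subsets to be chosen for $\theta$ such that

$$\bigcup_{l=1}^{\infty}\Theta^l = \Theta .$$

For $j = 0, 1, \ldots, m-1$, let

$$\Theta^l(\theta_{[j]}) = \left\{\theta_{(j+1)} : (\theta_{[j]}, \theta_{(j+1)}, \theta_{[\sim(j+1)]}) \in \Theta^l \text{ for some } \theta_{[\sim(j+1)]}\right\} .$$

Also, let $1(y \in \Omega)$ denote the indicator function that equals 1 if $y \in \Omega$ and 0 otherwise.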


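Specifically, one follows an algorithm of Berger and Bernardo (1992b). Towards
this, first define $|h_j(\theta)|$ following Datta and Ghosh (1996) as

$$|h_j(\theta)| = \frac{|I_{[\sim(j-1)][\sim(j-1)]}(\theta)|}{|I_{[\sim j][\sim j]}(\theta)|},\qquad j = 1, \ldots, m,$$

where

$$I_{[\sim j][\sim j]} = ((I_{ik})),\qquad i = j+1, \ldots, m,\; k = j+1, \ldots, m,$$

with $I_{ik}$ denoting the $(i,k)$th block of $I(\theta)$ when it is partitioned according to the
$m$ groups, and with the determinant in the denominator taken as 1 when $j = m$.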

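The algorithm is now described as follows:

Start. Define

$$\pi_m^l(\theta_{[\sim(m-1)]}|\theta_{[m-1]}) = \pi_m^l(\theta_{(m)}|\theta_{[m-1]}) = \frac{|h_m(\theta)|^{1/2}\,1(\theta_{(m)} \in \Theta^l(\theta_{[m-1]}))}{\int_{\Theta^l(\theta_{[m-1]})} |h_m(\theta)|^{1/2}\,d\theta_{(m)}} . \tag{1.1}$$

Iteration. For $j = m-1, m-2, \ldots, 1$, define

$$\pi_j^l(\theta_{[\sim(j-1)]}|\theta_{[j-1]}) = \frac{\pi_{j+1}^l(\theta_{[\sim j]}|\theta_{[j]})\,\exp\left\{\frac{1}{2}E_j^l[(\log|h_j(\theta)|)\,|\,\theta_{[j]}]\right\}\,1(\theta_{(j)} \in \Theta^l(\theta_{[j-1]}))}{\int_{\Theta^l(\theta_{[j-1]})} \exp\left\{\frac{1}{2}E_j^l[(\log|h_j(\theta)|)\,|\,\theta_{[j]}]\right\}\,d\theta_{(j)}} , \tag{1.2}$$

where

$$E_j^l[g(\theta)\,|\,\theta_{[j]}] = \int_{\{\theta_{[\sim j]} : (\theta_{[j]}, \theta_{[\sim j]}) \in \Theta^l\}} g(\theta)\,\pi_{j+1}^l(\theta_{[\sim j]}|\theta_{[j]})\,d\theta_{[\sim j]} . \tag{1.3}$$

Note that it is easy to check, by integrating in succession over $\theta_{(m)}, \ldots, \theta_{(j)}$,
that $\pi_j^l$ defines a probability distribution. For $j = 1$, interpret $\theta_{[\sim 0]}$ as $\theta$, $\theta_{[0]}$ as
vacuous, and write

$$\pi^l(\theta) = \pi_1^l(\theta_{[\sim 0]}|\theta_{[0]}) . \tag{1.4}$$

Finish. Define the m-group reference prior by

$$\pi(\theta) = \lim_{l\to\infty}\frac{\pi^l(\theta)}{\pi^l(\theta^*)} , \tag{1.5}$$

where $\theta^*$ is some point in $\Theta^1$.

Note that if the integrals and expectation in (1.1) and (1.2) are finite when the
$l$ is removed, that is, when $\Theta^l$ is replaced by $\Theta$ everywhere, then the reference prior
is defined simply by $\pi_1$, so that (1.5) is not needed.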


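In particular, suppose that $|h_j(\theta)|$ depends only on $\theta_{[j]}$, $j = 1, \ldots, m$. Then, the
m-group reference prior is given by

$$\pi^l(\theta) = \left(\prod_{j=1}^m \frac{|h_j(\theta)|^{1/2}}{\int_{\Theta^l(\theta_{[j-1]})} |h_j(\theta)|^{1/2}\,d\theta_{(j)}}\right) 1(\theta \in \Theta^l) .$$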


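Also Datta and M. Ghosh (1995) simplified the calculation of the reference priors
under some conditions. Assume that the Fisher information matrix $I(\theta)$ of $\theta$ is

$$I(\theta) = \text{block diagonal}(h_1(\theta), \ldots, h_m(\theta)),$$

where $h_j(\theta)$ is $n_j \times n_j$ (may not be diagonal). Let $\theta_{(j)}^c = (\theta_{(1)}, \ldots, \theta_{(j-1)}, \theta_{(j+1)}, \ldots, \theta_{(m)})$,
$j = 1, \ldots, m$. Assume that

$$|h_j(\theta)| = h_{j1}(\theta_{(j)})\,h_{j2}(\theta_{(j)}^c)$$

for nonnegative functions $h_{j1}$ and $h_{j2}$. Then,

$$\pi(\theta) = \prod_{j=1}^m h_{j1}(\theta_{(j)})^{1/2} .$$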
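
As a worked illustration of this factorization shortcut, consider a single observation from the $N(\mu, \sigma^2)$ model with groups $\theta_{(1)} = \mu$ and $\theta_{(2)} = \sigma$. The example is ours and is not taken from Datta and M. Ghosh (1995); it assumes that the conclusion of their result is that the reference prior is proportional to the product of the factors $h_{j1}(\theta_{(j)})^{1/2}$.

```latex
\[
\begin{aligned}
I(\mu,\sigma) &= \operatorname{diag}\!\left(\sigma^{-2},\; 2\sigma^{-2}\right), \\
h_1(\theta) &= \sigma^{-2} = h_{11}(\mu)\,h_{12}(\sigma),
  \qquad h_{11}(\mu) = 1,\quad h_{12}(\sigma) = \sigma^{-2}, \\
h_2(\theta) &= 2\sigma^{-2} = h_{21}(\sigma)\,h_{22}(\mu),
  \qquad h_{21}(\sigma) = 2\sigma^{-2},\quad h_{22}(\mu) = 1, \\
\pi(\mu,\sigma) &\propto h_{11}(\mu)^{1/2}\,h_{21}(\sigma)^{1/2} \propto \sigma^{-1}.
\end{aligned}
\]
```

The resulting prior $\sigma^{-1}$ differs from Jeffreys’ general rule prior $|I(\mu,\sigma)|^{1/2} \propto \sigma^{-2}$ for the same model.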

A somewhat different criterion for developing noninformative priors is based on
matching the posterior coverage probability of Bayesian credible intervals for a
real-valued parameter of interest with the corresponding frequentist coverage probability.
Such matching is accomplished through either (a) posterior quantiles, (b) the highest
posterior density (HPD) regions, or (c) the inversion of likelihood ratio and related
statistics. In the literature, (a), (b) and (c) have received attention in decreasing
order of importance.

We shall now discuss these matching concepts in greater detail. We begin with
the discussion of (a). Let $\{X_i, i \ge 1\}$ be a sequence of independent and identically
distributed, possibly vector-valued, random variables with common density $f(x;\theta)$,
where $\theta = (\theta_1, \ldots, \theta_p)^T$ belongs to some open subset of $\mathcal{R}^p$, and $\theta_1$ is the
parameter of interest. Let $X = (X_1, \ldots, X_n)^T$, where $n$ is the sample size, and denote
by $P^\pi\{\cdot|X\}$ the posterior probability measure for $\theta$ under a prior $\pi(\theta)$. Suppose
$\theta_1^{(1-\alpha)}(\pi, X)$ denotes an approximate $(1-\alpha)$th posterior quantile which satisfies
$P^\pi\{\theta_1 \le \theta_1^{(1-\alpha)}(\pi, X)\,|\,X\} = 1 - \alpha + o(n^{-1/2})$. We seek to characterize the class
of priors $\pi(\theta)$ such that $P\{\theta_1 \le \theta_1^{(1-\alpha)}(\pi, X)\,|\,\theta\} = 1 - \alpha + o(n^{-1/2})$. Priors $\pi$ satisfying
this property will be called first order probability matching priors.
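
The matching property can be examined by simulation: fix $\theta$, generate many data sets, and record how often $\theta$ falls below the posterior quantile. The sketch below is an illustration under assumptions of our own choosing: an exponential model with rate $\theta$ and the prior $\pi(\theta) \propto 1/\theta$, for which the posterior is Gamma with shape $n$ and rate $\sum x_i$. The Gamma quantile is approximated by Monte Carlo so that only the standard library is needed, and all function names are hypothetical.

```python
import random

def gamma_quantile(shape, prob, draws, rng):
    """Monte Carlo approximation to the prob-quantile of a Gamma(shape, scale=1) law."""
    sample = sorted(rng.gammavariate(shape, 1.0) for _ in range(draws))
    return sample[int(prob * draws)]

def coverage_of_posterior_quantile(theta, n, alpha, reps, rng):
    """Frequentist probability that theta lies below the (1 - alpha) posterior quantile."""
    # Under pi(theta) proportional to 1/theta, theta | x ~ Gamma(n, rate = sum(x)),
    # so the posterior quantile equals q / sum(x) with q a Gamma(n, 1) quantile.
    q = gamma_quantile(n, 1.0 - alpha, 200_000, rng)
    hits = 0
    for _ in range(reps):
        total = sum(rng.expovariate(theta) for _ in range(n))
        if theta <= q / total:
            hits += 1
    return hits / reps

rng = random.Random(2000)
estimate = coverage_of_posterior_quantile(theta=3.0, n=5, alpha=0.05, reps=20_000, rng=rng)
print(estimate)
```

In this particular model $\theta \sum x_i$ has the same Gamma$(n, 1)$ law under the posterior and under repeated sampling, so the matching is exact for every $n$ and the estimate should sit close to 0.95 up to Monte Carlo error. In general only asymptotic agreement can be expected.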

A brief introduction to matching priors is given in Lindley (1958), but their
systematic study began with Welch and Peers (1963) who considered development of
such priors in the absence of nuisance parameters, for $p = 1$. These authors showed
in this case that the unique prior satisfying the first order matching property is
Jeffreys’ prior. Peers (1965) followed up this study by including nuisance parameters.
Writing $I[= I(\theta)] = ((I_{jr}))$ as the Fisher information matrix per unit observation
and $I^{-1}[= I^{-1}(\theta)] = ((I^{jr}))$, he characterized the class of priors $\pi$ satisfying the first
order matching property as solutions of the partial differential equation

$$A_1(\pi, \theta) = \sum_{j=1}^p \frac{\partial}{\partial\theta_j}\left\{I^{j1}(I^{11})^{-1/2}\pi(\theta)\right\} = 0 . \tag{1.6}$$

We also refer to Stein (1985) in this context. For $p \ge 2$, one may be interested in
priors that are first order probability matching for each component of $\theta$. Starting
from equations like (1.6) separately for each $\theta_1, \ldots, \theta_p$, Peers (1965) gave necessary
and sufficient conditions on the model for the existence of such priors.

As it turns out, Jeffreys’ prior $\pi(\theta) \propto |I(\theta)|^{1/2}$ need not always satisfy (1.6) even
though other priors satisfying (1.6) could exist. An important simplification of (1.6) is
due to Tibshirani (1989) in the case when $\theta_1$ is orthogonal to the nuisance parameters
$(\theta_2, \ldots, \theta_p)$, that is, when $I_{1j} = 0$, identically in $\theta$ ($2 \le j \le p$). In this case, the class
of first order probability matching priors $\pi(\theta)$ is characterized by

$$\pi(\theta) = I_{11}^{1/2}\,g(\theta_2, \ldots, \theta_p), \tag{1.7}$$

where $g$ is an arbitrary positive function.
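
A short worked example may help fix ideas about the characterization under orthogonality. It is ours rather than the dissertation’s, and assumes the $N(\mu, \sigma^2)$ model with parameter of interest $\theta_1 = \mu$ and nuisance parameter $\theta_2 = \sigma$.

```latex
\[
\begin{aligned}
I(\mu,\sigma) &= \operatorname{diag}\!\left(\sigma^{-2},\; 2\sigma^{-2}\right),
  \qquad I_{11} = \sigma^{-2},\quad I_{12} = 0, \\
\pi(\mu,\sigma) &= I_{11}^{1/2}\,g(\sigma) = \sigma^{-1} g(\sigma), \\
g(\sigma) \equiv 1 &\;\Rightarrow\; \pi(\mu,\sigma) = \sigma^{-1},
  \qquad
g(\sigma) = \sigma^{-1} \;\Rightarrow\; \pi(\mu,\sigma) = \sigma^{-2}.
\end{aligned}
\]
```

Every positive choice of $g$ therefore yields a first order probability matching prior for $\mu$, which illustrates why a further criterion is needed to select among them.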

The idea of matching was pursued further in later papers by Nicolaou (1993) and
Datta and M. Ghosh (1995). As is easily seen, there are infinitely many first order
probability matching priors in the presence of nuisance parameters. With the intent
of narrowing down the selection of priors, Mukerjee and Dey (1993) considered priors
with Bayesian credible intervals and frequentist confidence intervals matching up to
$o(n^{-1})$, instead of $o(n^{-1/2})$, for the case of a single nuisance parameter. These results were
later generalized by Mukerjee and Ghosh (1997) in the presence of several nuisance
parameters. The findings of these authors indicate that second order probability
matching priors need not always exist, while sometimes all first order probability
matching priors also meet the second order matching criterion. Barring these extreme
situations, second order matching priors help to narrow down the selection of priors
and indeed often lead to a unique prior within the class of first order probability
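matching priors. In order to develop such priors, let $\tau^{jr} = I^{j1}I^{r1}/I^{11}$ and $L_{jrs} =
E_\theta\{\partial^3\log f(X_1;\theta)/\partial\theta_j\,\partial\theta_r\,\partial\theta_s\}$, $1 \le j, r, s \le p$. Then Mukerjee and Ghosh (1997)
showed that a prior $\pi$ is second order probability matching based on the posterior
quantiles if and only if it satisfies (1.6) and, in addition, satisfies a second partial
differential equation $A_2(\pi, \theta) = 0$, where

$$A_2(\pi, \theta) = \sum_{j=1}^p\sum_{r=1}^p \frac{\partial^2}{\partial\theta_j\,\partial\theta_r}\left\{\tau^{jr}\pi(\theta)\right\} - \frac{1}{3}\sum_{v=1}^p\sum_{j=1}^p\sum_{r=1}^p\sum_{s=1}^p \frac{\partial}{\partial\theta_v}\left\{L_{jrs}\,\tau^{jr}\left(3I^{sv} - 2\tau^{sv}\right)\pi(\theta)\right\} . \tag{1.8}$$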

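Under parametric orthogonality, (1.8) simplifies to

$$\frac{1}{6}\,g(\theta_2, \ldots, \theta_p)\,\frac{\partial}{\partial\theta_1}\left(I_{11}^{-3/2}L_{1,1,1}\right) + \sum_{v=2}^p\sum_{s=2}^p \frac{\partial}{\partial\theta_v}\left\{I_{11}^{-1/2}L_{11s}\,I^{sv}g(\theta_2, \ldots, \theta_p)\right\} = 0 , \tag{1.9}$$

where $L_{1,1,1} = E_\theta[\{\partial\log f(X_1;\theta)/\partial\theta_1\}^3]$. With only one nuisance parameter, (1.9)
reduces to

$$\frac{1}{6}\,g(\theta_2)\,\frac{\partial}{\partial\theta_1}\left(I_{11}^{-3/2}L_{1,1,1}\right) + \frac{\partial}{\partial\theta_2}\left\{I_{11}^{-1/2}L_{112}\,I^{22}g(\theta_2)\right\} = 0, \tag{1.10}$$

which is due to Mukerjee and Dey (1993). For $p = 1$, that is when there are no
nuisance parameters, it follows from (1.9) that Jeffreys’ prior is second order probability
matching if and only if $I_{11}^{-3/2}L_{1,1,1}$ is free from $\theta_1$.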
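
The one-parameter condition can be checked numerically for a given model by estimating the standardized third moment of the score, $I(\theta)^{-3/2} E_\theta[\{\partial \log f(X_1;\theta)/\partial\theta\}^3]$, at several values of $\theta$. The sketch below assumes the exponential density $f(x;\theta) = \theta e^{-\theta x}$, chosen purely for illustration, and uses plain Monte Carlo; the function name is hypothetical.

```python
import random

def standardized_third_moment(theta, draws, rng):
    """Estimate I(theta)^(-3/2) * E[score^3] for the density theta * exp(-theta * x)."""
    info = 1.0 / theta**2                  # Fisher information per observation
    total = 0.0
    for _ in range(draws):
        x = rng.expovariate(theta)
        score = 1.0 / theta - x            # derivative of log f(x; theta) in theta
        total += score**3
    return info ** (-1.5) * total / draws

rng = random.Random(1997)
estimates = {theta: standardized_third_moment(theta, 200_000, rng)
             for theta in (0.5, 2.0, 5.0)}
print(estimates)
```

The exact value is $-2$ for every $\theta$, because the third central moment of the exponential law is $2/\theta^3$. The quantity is therefore free of $\theta$, and Jeffreys’ prior $\pi(\theta) \propto 1/\theta$ satisfies the second order condition in this model.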

Probability  matching  priors  are  invariant  under  parametric  transformation  (cf. 
Datta and Ghosh, 1996; Mukerjee and Ghosh, 1997), while reference priors are
invariant  only  under  parametric  transformation  within  each  group. 

Probability matching priors for specific problems have been investigated by a few
authors. Ghosh, Carlin and Srivastava (1995) derived and studied a class of first order
probability matching priors and a complete catalog of reference priors for the
univariate linear calibration problem. Ghosh and Yang (1996) derived a class of first order
probability matching priors for the two-sample normal problem. Sun and Ye (1996)
considered a two-parameter exponential family with first and second order probability
matching priors. Garvan and Ghosh (1997) studied probability matching and
reference priors in the more general case of dispersion models of Jørgensen (1992).

Next, we discuss the criteria (b) and (c).

DiCiccio and Stern (1994) and Ghosh and Mukerjee (1995) discussed matching
through the HPD region. Specifically, if $\pi(\cdot|X)$ denotes the posterior distribution of $\theta_1$
under a prior $\pi$, and $k_\alpha = k_\alpha(\pi, X)$ is such that

$$P^\pi\left\{\pi(\theta_1|X) \ge k_\alpha \,\middle|\, X\right\} = 1 - \alpha + o(n^{-1}), \tag{1.11}$$

then the HPD region for $\theta_1$ with posterior coverage probability $1 - \alpha + o(n^{-1})$ is given
by

$$H_\alpha(\pi, X) = \{\theta_1 : \pi(\theta_1|X) \ge k_\alpha\} . \tag{1.12}$$

DiCiccio and Stern (1994) and Ghosh and Mukerjee (1995) characterized priors $\pi$ for
which

$$P\left\{\theta_1 \in H_\alpha(\pi, X) \,\middle|\, \theta_1, \ldots, \theta_p\right\} = 1 - \alpha + o(n^{-1}) \tag{1.13}$$

for all $\theta_1, \ldots, \theta_p$ and all $\alpha \in (0,1)$. They found necessary and sufficient conditions
under which $\pi$ satisfies (1.13).

Yet a third criterion of matching is via inversion of posterior Bartlett corrected
likelihood ratio statistics. This is investigated in Ghosh and Mukerjee (1991) and in
DiCiccio and Stern (1994). Specifically, these authors have found priors under which
the Bayesian and frequentist Bartlett corrections for the likelihood ratio test statistic
differ by $o(1)$ when the sample size tends to infinity. Similar results are found in
Ghosh and Mukerjee (1992) for conditional likelihood ratio test statistics.

As mentioned in Ghosh (1994, p. 86), there are usually four criteria associated with
the development of noninformative priors. These are (i) maximization of entropy or
minimization of information; (ii) matching asymptotically the coverage probabilities
of Bayesian credible sets with the corresponding frequentist probabilities; (iii)
principle of group invariance; and (iv) minimaxity of Bayesian procedures. Of these, (i)
and (ii) have found the widest applicability in the Bayesian literature. The reference
priors are very satisfactory under criterion (i), while probability matching priors, as
the name implies, are ideal for (ii).

The  utility  of  noninformative  priors  has  always  been  questioned  by  subjectivists. 
Yet  one  cannot  deny  their  pragmatic  appeal.  Indeed,  the  wider  acceptance  of  Bayesian 
techniques  in  recent  years  both  in  the  theory  and  in  the  practice  of  statistics  can  be 
partly  attributed  to  the  fact  that  even  with  little  or  vague  prior  information,  Bayesian 
techniques  can  often  be  used  successfully  by  employing  noninformative  priors.  Reid 
(1995)  and  Kass  and  Wasserman  (1996)  contain  excellent  reviews  of  noninformative 
priors. 


1.2  The  Subject  of  This  Dissertation 

In Chapter 2, we consider the Behrens-Fisher problem which involves inference
about the difference of two normal means when the ratio of the two variances is
unknown. The class of second order probability matching priors, the reference priors,
and Jeffreys’ general rule prior are derived and are compared with Jeffreys’
independent prior via a simulation study. One particular second order probability
matching prior is recommended for use. It turns out that this second order probability
matching prior satisfies the matching criteria (b) and (c). Our simulation study
indicates that second order probability matching priors perform better than the other
priors in terms of target coverage probabilities in a frequentist sense even for small
and moderate sample sizes.

Chapter 3 considers the interval estimation of the common mean of several normal
populations. The class of first order probability matching priors, the reference priors,
Jeffreys’ prior, and a second order probability matching prior are derived. It turns out
that the second order probability matching prior also satisfies the matching criteria
(b) and (c). Equal tailed and HPD credible intervals are compared against several
frequentist confidence intervals.

Chapter 4 is devoted to Bayesian inference involving ratios of regression
coefficients. A nontrivial orthogonal reparameterization is introduced which facilitates the
development of default Bayesian priors such as the reference priors and probability
matching priors in the general case of multiple regression models. As special cases, we
have considered the Fieller-Creasy problem, linear calibration problem, parallel-line
bioassay and slope-ratio bioassay.

Finally, in Chapter 5, we summarize the results of this dissertation and propose
several topics for future research.


CHAPTER 2
THE BEHRENS-FISHER PROBLEM


2.1 Introduction


The  Behrens-Fisher  problem  is  that  of  testing  whether  the  means  of  two  normal 
populations  are  equal  without  necessarily  assuming  equality  of  the  variances.  An 
essentially  equivalent  problem  is  that  of  finding  a confidence  interval  for  the  difference 
of  two  normal  means. 

Specifically, suppose that two independent random samples of sizes $n_1 (\ge 2)$ and
$n_2 (\ge 2)$ are drawn from two normal populations with respective means $\mu_1$ and $\mu_2$,
and respective variances $\sigma_1^2$ and $\sigma_2^2$. The corresponding sample means are denoted by
$\bar{X}_1$ and $\bar{X}_2$, and sample variances with denominators $n_1 - 1$ and $n_2 - 1$ are denoted
by $S_1^2$ and $S_2^2$.

If the variance ratio $\sigma_1^2/\sigma_2^2 = \eta$ is assumed to be known (and, in particular, when
$\eta = 1$), then the pivot

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{[(n_1 - 1)S_1^2\eta^{-1} + (n_2 - 1)S_2^2]^{1/2}}\left(\frac{n_1 + n_2 - 2}{\eta/n_1 + 1/n_2}\right)^{1/2} \tag{2.1}$$

has a t-distribution with $n_1 + n_2 - 2$ degrees of freedom yielding the $100(1 - \alpha)\%$
confidence interval

$$\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2;\,n_1+n_2-2}\left[\frac{(n_1 - 1)S_1^2\eta^{-1} + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}\right]^{1/2}\left(\frac{\eta}{n_1} + \frac{1}{n_2}\right)^{1/2} \tag{2.2}$$

for $\mu_1 - \mu_2$. Throughout, we will use the symbol $t_{\alpha;\nu}$ for the upper $100(1 - \alpha)\%$ point
of Student’s t with $\nu$ degrees of freedom. The Behrens-Fisher problem arises when
the variance ratio $\sigma_1^2/\sigma_2^2$ is unknown.
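
The known-ratio interval (2.2) is straightforward to compute. The following is a minimal sketch, assuming the data arrive as two plain lists and that the caller supplies the Student-t critical value $t_{\alpha/2;\,n_1+n_2-2}$ from tables or other software, since the standard library has no t quantile function. The function name `known_ratio_interval` is hypothetical.

```python
import math
from statistics import mean, variance

def known_ratio_interval(x1, x2, eta, t_crit):
    """Interval (2.2) for mu1 - mu2 when eta = sigma1^2 / sigma2^2 is known.

    t_crit is the critical value t_{alpha/2; n1+n2-2}, supplied by the caller.
    """
    n1, n2 = len(x1), len(x2)
    s1_sq, s2_sq = variance(x1), variance(x2)       # denominators n1 - 1 and n2 - 1
    # pooled estimate of sigma2^2, with the first sample rescaled by 1 / eta
    pooled = ((n1 - 1) * s1_sq / eta + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(pooled * (eta / n1 + 1.0 / n2))
    centre = mean(x1) - mean(x2)
    return centre - half_width, centre + half_width

# Equal variances (eta = 1), 5 degrees of freedom, 95% interval.
lo, hi = known_ratio_interval([1, 2, 3, 4], [2, 4, 6], eta=1.0, t_crit=2.5706)
print(lo, hi)
```

With $\eta = 1$ this is the usual pooled two-sample t interval. The Behrens-Fisher difficulty is precisely that no such value of $\eta$ is available to plug in.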




The above inference problem has received considerable attention for several decades.
Here the fiducial interval of Fisher differs drastically from the Neyman-Pearson
confidence interval. This issue is discussed extensively in Kendall and Stuart (1967).
Scheffé (1970) and Lee and Gurland (1975) provide reviews of various solutions to
the problem.

Behrens (1929) was the first to offer a solution to the Behrens-Fisher problem in
a testing context, namely $H_0 : \mu_1 = \mu_2$ against $H_1 : \mu_1 \ne \mu_2$. Later, Fisher (1935)
pointed out that this solution could be justified using the fiducial theory of inference.
From a frequentist perspective, Bartlett (1936) noted that Behrens’ test has a size
different from what was originally intended. When this test is inverted into an interval, this
amounts to the fact that the coverage probability of the confidence interval for $\mu_1 - \mu_2$
is different from the specified confidence coefficient. Behrens’ interval is based on the
pivot

$$D = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{(n_1^{-1}S_1^2 + n_2^{-1}S_2^2)^{1/2}} .$$

In contrast, Welch (1947) provided a test having type I error very close to the
nominal value throughout the parameter space. Thus, the resulting confidence
interval for $\mu_1 - \mu_2$ has coverage probability very nearly equal to the target confidence
coefficient. Welch’s procedure is also based on the pivot $D$, but he uses different
significance points. These calculations were later extended by Aspin (1948).
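
For readers who wish to compute the statistic, the sketch below evaluates the pivot $D$ at a hypothesized difference together with the Welch-Satterthwaite approximate degrees of freedom. This is an assumption on our part about what is convenient to show: the degrees-of-freedom formula is the widely used approximation associated with Welch’s approach, not the series-based significance points of Welch (1947) and Aspin (1948), and the function names are hypothetical.

```python
import math
from statistics import mean, variance

def behrens_fisher_statistic(x1, x2, delta=0.0):
    """Pivot D evaluated at the hypothesised difference delta = mu1 - mu2."""
    v1 = variance(x1) / len(x1)            # S1^2 / n1
    v2 = variance(x2) / len(x2)            # S2^2 / n2
    return (mean(x1) - mean(x2) - delta) / math.sqrt(v1 + v2)

def welch_satterthwaite_df(x1, x2):
    """Approximate degrees of freedom often used to refer D to a t distribution."""
    n1, n2 = len(x1), len(x2)
    v1 = variance(x1) / n1
    v2 = variance(x2) / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

x1, x2 = [1, 2, 3, 4], [2, 4, 6]
d_value = behrens_fisher_statistic(x1, x2)
df_value = welch_satterthwaite_df(x1, x2)
print(d_value, df_value)
```

The approximate degrees of freedom always lie between $\min(n_1, n_2) - 1$ and $n_1 + n_2 - 2$, so the reference distribution adapts to how unequal the two estimated variances appear to be.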

Fisher (1956) criticized Welch’s test in a way that amounts to showing the
existence of a negatively biased relevant subset in the sense of Buehler (1959). For instance,
as shown in Robinson (1982), $P[|D| > a\,|\,S_1/S_2 = 1] \ge P[|t_{n_1+n_2-2}| > a]$ for all $a > 0$.
Thus, the set where $S_1/S_2 = 1$ is a relevant subset: conditionally on it, confidence
intervals based on the Welch-Aspin test cover the true value of $\mu_1 - \mu_2$ less often than
the nominal value suggests. On the other hand, as shown by Robinson (1976), negatively biased
relevant subsets do not exist for Behrens’ solution.




Jeffreys (1961) pointed out that a Bayesian calculation based on the so-called
independent prior $\pi^{JI}(\mu_1, \mu_2, \sigma_1, \sigma_2) \propto (\sigma_1\sigma_2)^{-1}$ yields a credible interval for
$\mu_1 - \mu_2$ which is algebraically equivalent to the fiducial interval of Fisher, although the
interpretations are necessarily different. Throughout, we will refer to this prior as
Jeffreys’ independent prior. The same prior is mentioned in Cox and Hinkley (1974).
It is worth pointing out here that Jeffreys’ independent prior is different
from Jeffreys’ general rule prior, which is the positive square root of the determinant
of the Fisher information matrix and, in this example, is proportional to $(\sigma_1\sigma_2)^{-2}$.

Thus, while the prior $\pi^{JI}(\mu_1, \mu_2, \sigma_1, \sigma_2) \propto (\sigma_1\sigma_2)^{-1}$ lends support to the fiducial
solution (though it does not support the fiducial reasoning) from both Bayesian and
conditional frequentist perspectives, it is not clear whether this prior can
be justified by the usual unconditional frequentist criterion. In Section 2 of this
chapter, we provide some justification of this prior from a frequentist angle. More
importantly, we find in that section a new prior under which credible intervals
match the corresponding frequentist coverage probabilities asymptotically more
accurately than under Jeffreys’ independent prior. Though this matching can be justified only
asymptotically, our simulation results indicate that it is indeed achieved for small
or moderate sample sizes as well. This prior is also satisfactory from a conditional
frequentist perspective.

In Section 2 of this chapter, we develop first and second order probability matching
priors for the Behrens-Fisher problem when the parameter of interest is $\mu_1 - \mu_2$. It
is shown that Jeffreys’ independent prior is a first order probability matching
prior, but is not a second order probability matching prior. An alternate prior which
is a second order probability matching prior is derived in this section. This new prior
is also justified by some alternate matching criteria such as HPD matching, and
matching via inversion of likelihood ratio test statistics. Also, in this section, we
bring in some alternate criteria for the development of noninformative priors. In
particular, we find the “two-group” and “one-at-a-time” reference priors as developed
in Bernardo (1979) and Berger and Bernardo (1989, 1992a, b). It turns out that the
prior proposed by Jeffreys is a one-at-a-time reference prior based on a particular
ordering of the parameters.

In Section 3, the propriety of posteriors under the different priors developed in
Section 2 is established. Certain other properties of these posteriors such as symmetry
and unimodality are also proved in this section. Also, a small simulation study is
performed which shows that the second order probability matching prior developed
in this chapter matches the frequentist target coverage probabilities better than
Jeffreys’ independent prior, especially for small or moderate samples. Section 4 provides
justification of the former from a conditional frequentist perspective. In particular,
it is pointed out that unlike the Welch-Aspin solution, the credible intervals under
the proposed second order matching prior, though satisfactory from a frequentist
perspective, do not admit negatively biased relevant subsets. Some concluding remarks
are made in Section 5.

In comparison with Jeffreys’ independent prior or the one-at-a-time reference
prior, the new prior proposed in this chapter seems to be more appropriate under
the HPD criterion, especially for achieving the Bayes-frequentist synthesis for small
and moderate sample sizes. This is evidenced in the numerical findings of Section 3.
Clearly, there are situations such as the Fieller-Creasy (the ratio of normal means)
problem where the Bayes-frequentist synthesis may not be appropriate. There the
frequentist confidence set may be two disjoint unbounded sets, or sometimes even
the whole real line. However, frequentist procedures encounter no such difficulty in
the Behrens-Fisher problem, and as such, the proposed Bayesian method does
not suffer any frequentist shortcomings. Our findings are also quite in conformity
with Berger and Bernardo (1989), who recommended evaluation of noninformative
priors by good frequentist properties. Finally, Jeffreys’ independent prior and the
newly developed prior are both quite satisfactory from a conditional frequentist
perspective.


2.2  Development  of  Noninformative  Priors 


2.2.1  Probability  Matching  Priors 

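Let $X_{11}, \ldots, X_{1n_1}, X_{21}, \ldots, X_{2n_2}$ be mutually independent with $X_{ij}$ $(j = 1, \ldots, n_i;\ i = 1, 2)$
iid $N(\mu_i, \sigma_i^2)$. We write $X_1 = (X_{11}, \ldots, X_{1n_1})$, $X_2 = (X_{21}, \ldots, X_{2n_2})$ and $n_i = Np_i$ $(i = 1, 2)$,
where $p_1 + p_2 = 1$, so that $N = n_1 + n_2$. The parameter of interest is $\theta_1 = \mu_1 - \mu_2$. For any given
prior $\pi$, we denote the posterior by $\pi(\cdot \mid X_1, X_2)$. The first objective is to find the
class of first order probability matching priors when $\theta_1$ is the parameter of interest.
In this example, this amounts to finding a $\pi$ such that

\[ P_{\mu_1,\mu_2,\sigma_1,\sigma_2}\left[\theta_1 \le \theta_1^{1-\alpha}(\pi, X_1, X_2)\right] = 1 - \alpha + o(N^{-u}) \tag{2.3} \]

as $N \to \infty$ for some $u > 0$, where $\theta_1^{1-\alpha}(\pi, X_1, X_2)$ denotes the $(1 - \alpha)$th posterior
quantile of $\theta_1$ based on the prior $\pi$.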

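We begin by noting that the Fisher information matrix $I(\mu_1, \mu_2, \sigma_1, \sigma_2)$ is given
by

\[ I(\mu_1, \mu_2, \sigma_1, \sigma_2) = \mathrm{Diagonal}\left(\frac{n_1}{\sigma_1^2},\ \frac{n_2}{\sigma_2^2},\ \frac{2n_1}{\sigma_1^2},\ \frac{2n_2}{\sigma_2^2}\right). \]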

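Next we propose the one-to-one reparameterization

\[ \theta_1 = \mu_1 - \mu_2, \qquad
   \theta_2 = \frac{n_1\sigma_1^{-2}\mu_1 + n_2\sigma_2^{-2}\mu_2}{n_1\sigma_1^{-2} + n_2\sigma_2^{-2}}, \qquad
   \theta_3 = \sigma_1, \qquad \theta_4 = \sigma_2, \]
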


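which transforms the Fisher information matrix to

\[ I(\theta) = \begin{pmatrix} (\theta_3^2/n_1 + \theta_4^2/n_2)^{-1} & 0^T \\ 0 & I_{22} \end{pmatrix}, \tag{2.4} \]

where $\theta = (\theta_1, \theta_2, \theta_3, \theta_4)$ and

\[ I_{22} = \begin{pmatrix}
\dfrac{n_1}{\theta_3^2} + \dfrac{n_2}{\theta_4^2} &
\dfrac{2\theta_1}{\theta_3(\theta_3^2/n_1 + \theta_4^2/n_2)} &
-\dfrac{2\theta_1}{\theta_4(\theta_3^2/n_1 + \theta_4^2/n_2)} \\[2ex]
\dfrac{2\theta_1}{\theta_3(\theta_3^2/n_1 + \theta_4^2/n_2)} &
\dfrac{2n_1}{\theta_3^2} + \dfrac{4\theta_1^2\theta_4^2}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} &
-\dfrac{4\theta_1^2\theta_3\theta_4}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} \\[2ex]
-\dfrac{2\theta_1}{\theta_4(\theta_3^2/n_1 + \theta_4^2/n_2)} &
-\dfrac{4\theta_1^2\theta_3\theta_4}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} &
\dfrac{2n_2}{\theta_4^2} + \dfrac{4\theta_1^2\theta_3^2}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3}
\end{pmatrix}. \tag{2.5} \]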


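In order to derive (2.5), we may note that the transformation is equivalent to

\[ \mu_1 = \theta_2 + \frac{n_2\theta_1\theta_3^2}{n_2\theta_3^2 + n_1\theta_4^2}, \qquad
   \mu_2 = \theta_2 - \frac{n_1\theta_1\theta_4^2}{n_2\theta_3^2 + n_1\theta_4^2}, \qquad
   \sigma_1 = \theta_3, \qquad \sigma_2 = \theta_4, \]

so that the Jacobian matrix is given by

\[ J = \begin{pmatrix}
\dfrac{n_2\theta_3^2}{n_2\theta_3^2 + n_1\theta_4^2} & -\dfrac{n_1\theta_4^2}{n_2\theta_3^2 + n_1\theta_4^2} & 0 & 0 \\[2ex]
1 & 1 & 0 & 0 \\[1ex]
\dfrac{2n_1n_2\theta_1\theta_3\theta_4^2}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & \dfrac{2n_1n_2\theta_1\theta_3\theta_4^2}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & 1 & 0 \\[2ex]
-\dfrac{2n_1n_2\theta_1\theta_3^2\theta_4}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & -\dfrac{2n_1n_2\theta_1\theta_3^2\theta_4}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & 0 & 1
\end{pmatrix}. \]
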

Now $I(\theta)$ is found as $I(\theta) = J\,I(\mu)\,J^T$, where $I(\mu)$ is expressed in terms of $\theta$ and
$\mu = (\mu_1, \mu_2, \sigma_1, \sigma_2)$. By (2.4), $\theta_1$ is orthogonal to $(\theta_2, \theta_3, \theta_4)$ in the sense of Cox and
Reid (1987). Now, from Tibshirani (1989) and Datta and J.K. Ghosh (1995), the class
of first order probability matching priors is characterized by

\[ \pi(\theta) \propto (\theta_3^2/n_1 + \theta_4^2/n_2)^{-1/2}\, g(\theta_2, \theta_3, \theta_4), \tag{2.6} \]

where $g$ is an arbitrary function differentiable in its arguments.
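The orthogonality read off from (2.4) can be checked numerically. The following is a minimal sketch, assuming the reparameterization given above; the function names `mu_sigma` and `fisher_theta`, the finite-difference step and the evaluation point are illustrative choices and not part of the original derivation.

```python
def mu_sigma(theta, n1, n2):
    """Map theta = (theta1, theta2, theta3, theta4) back to (mu1, mu2, sigma1, sigma2)."""
    t1, t2, t3, t4 = theta
    d = n2 * t3 ** 2 + n1 * t4 ** 2
    return [t2 + n2 * t1 * t3 ** 2 / d, t2 - n1 * t1 * t4 ** 2 / d, t3, t4]


def fisher_theta(theta, n1, n2, h=1e-6):
    """Approximate I(theta) = J I(mu) J^T with a central-difference Jacobian."""
    jac = []
    for i in range(4):
        up, dn = list(theta), list(theta)
        up[i] += h
        dn[i] -= h
        f_up, f_dn = mu_sigma(up, n1, n2), mu_sigma(dn, n1, n2)
        # Row i holds the derivatives of (mu1, mu2, sigma1, sigma2) w.r.t. theta_i.
        jac.append([(a - b) / (2 * h) for a, b in zip(f_up, f_dn)])
    s1, s2 = theta[2], theta[3]
    # Diagonal Fisher information in the original parameterization.
    diag = [n1 / s1 ** 2, n2 / s2 ** 2, 2 * n1 / s1 ** 2, 2 * n2 / s2 ** 2]
    return [[sum(jac[i][k] * diag[k] * jac[j][k] for k in range(4))
             for j in range(4)] for i in range(4)]


info = fisher_theta((1.5, 0.3, 1.2, 2.0), n1=5, n2=8)
print(info[0])  # first row: I_11 followed by entries that should be close to zero
```

The first diagonal entry should agree with $(\theta_3^2/n_1 + \theta_4^2/n_2)^{-1}$ and the remaining entries of the first row should vanish up to discretization error.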




Due  to  the  arbitrariness  of  g,  the  class  of  priors  given  in  (2.6)  contains  infinitely 
many  members.  In  order  to  narrow  down  the  selection  of  priors  within  this  class, 
we  consider  second  order  probability  matching  priors  (see  Mukerjee  and  Dey,  1993; 
Mukerjee  and  Ghosh,  1997). 

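To this end, first rewrite the likelihood as

\[ L(\theta) \propto \theta_3^{-n_1}\theta_4^{-n_2}
\exp\left[-\frac{1}{2\theta_3^2}\sum_{j=1}^{n_1}\left(X_{1j} - \theta_2 - \frac{\theta_1\theta_3^2/n_1}{\theta_3^2/n_1 + \theta_4^2/n_2}\right)^2\right]
\times \exp\left[-\frac{1}{2\theta_4^2}\sum_{j=1}^{n_2}\left(X_{2j} - \theta_2 + \frac{\theta_1\theta_4^2/n_2}{\theta_3^2/n_1 + \theta_4^2/n_2}\right)^2\right]. \tag{2.7} \]
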


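Writing $I(\theta) = ((I_{ij}))$, $i, j = 1, \ldots, 4$, and following Mukerjee and Ghosh (1997),
a prior given in (2.6) is a second order probability matching prior if and only if $g$
satisfies

\[ \frac{1}{6}\, g(\theta_2, \theta_3, \theta_4)\, \frac{\partial}{\partial\theta_1}\left\{I_{11}^{-3/2} L_{1,1,1}\right\}
+ \sum_{v=2}^{4}\sum_{s=2}^{4} \frac{\partial}{\partial\theta_v}\left\{I_{11}^{-1/2} L_{11s} I^{sv} g(\theta_2, \theta_3, \theta_4)\right\} = 0, \tag{2.8} \]

where $L_{1,1,1} = E_\theta\left[(\partial \log L/\partial\theta_1)^3\right]$ and $L_{11s} = E_\theta\left[\partial^3 \log L/\partial\theta_1^2\partial\theta_s\right]$, $s = 2, 3, 4$. From (2.7), after
considerable algebra, we have

\[ L_{1,1,1} = 0, \qquad L_{112} = 0, \qquad
   L_{113} = \frac{2\theta_3}{n_1}\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{-2}, \qquad
   L_{114} = \frac{2\theta_4}{n_2}\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{-2}. \]
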


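Finally, writing the inverse matrix of $I(\theta)$ as $I^{-1}(\theta) = ((I^{ij}))$, $i, j = 1, 2, 3, 4$,
the resulting inverse matrix $I^{-1}(\theta)$ is given by

\[ I^{-1}(\theta) = \begin{pmatrix}
\dfrac{\theta_3^2}{n_1} + \dfrac{\theta_4^2}{n_2} & 0 & 0 & 0 \\[2ex]
0 & I^{22} & -\dfrac{n_2\theta_1\theta_3^3\theta_4^2}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & \dfrac{n_1\theta_1\theta_3^2\theta_4^3}{(n_2\theta_3^2 + n_1\theta_4^2)^2} \\[2ex]
0 & -\dfrac{n_2\theta_1\theta_3^3\theta_4^2}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & \dfrac{\theta_3^2}{2n_1} & 0 \\[2ex]
0 & \dfrac{n_1\theta_1\theta_3^2\theta_4^3}{(n_2\theta_3^2 + n_1\theta_4^2)^2} & 0 & \dfrac{\theta_4^2}{2n_2}
\end{pmatrix}, \]

where

\[ I^{22} = \frac{\theta_3^2\theta_4^2}{n_2\theta_3^2 + n_1\theta_4^2} + \frac{2n_1n_2(n_1 + n_2)\theta_1^2\theta_3^4\theta_4^4}{(n_2\theta_3^2 + n_1\theta_4^2)^4}. \]
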


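With these substitutions (2.8) reduces to

\[ n_1^{-2}\frac{\partial}{\partial\theta_3}\left\{\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{-3/2}\theta_3^3\, g(\theta_2, \theta_3, \theta_4)\right\}
+ n_2^{-2}\frac{\partial}{\partial\theta_4}\left\{\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{-3/2}\theta_4^3\, g(\theta_2, \theta_3, \theta_4)\right\} = 0. \tag{2.9} \]
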


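Consider the class of functions $g(\cdot)$ such that

\[ g(\theta_2, \theta_3, \theta_4) = k(\theta_2)\,h(\theta_3, \theta_4)\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{3/2}\theta_3^{-3}\theta_4^{-3}. \]

Then, (2.9) reduces to

\[ n_1^{-2}\theta_4^{-3}\frac{\partial h}{\partial\theta_3} + n_2^{-2}\theta_3^{-3}\frac{\partial h}{\partial\theta_4} = 0. \]

Equivalently,

\[ \frac{\theta_3^3}{n_1^2}\frac{\partial h}{\partial\theta_3} + \frac{\theta_4^3}{n_2^2}\frac{\partial h}{\partial\theta_4} = 0. \tag{2.10} \]

A general class of solutions to (2.10) is given by

\[ h(\theta_3, \theta_4) = h^*\left(\frac{n_1^2}{\theta_3^2} - \frac{n_2^2}{\theta_4^2}\right), \]

where $h^*(\cdot)$ is an arbitrary function differentiable in its arguments. Hence, a general
class of solutions to (2.9) is given by

\[ g(\theta_2, \theta_3, \theta_4) = \theta_3^{-3}\theta_4^{-3}\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{3/2} u\left(\frac{n_1^2}{\theta_3^2} - \frac{n_2^2}{\theta_4^2}\right), \]

where $u(\cdot)$ is an arbitrary differentiable function. However, not every solution is
permissible in the construction of priors. For instance, since the prior has to be
nonnegative, any function of the form $(n_1^2/\theta_3^2 - n_2^2/\theta_4^2)^{2k-1}$, $k$ being a positive integer,
needs to be excluded from consideration. However, functions of the form
$(n_1^2/\theta_3^2 - n_2^2/\theta_4^2)^{2k}$, $k$ being a nonnegative integer, indeed lead to proper posteriors. But these
posteriors become more and more complex as $k$ gets larger. Also, there does not seem
to be any improvement in the coverage probabilities with these complex posteriors.
Thus, we have chosen $u(\cdot)$ to be a constant function. The resulting second order
probability matching prior is then given by

\[ \pi^M(\theta) \propto \theta_3^{-3}\theta_4^{-3}(\theta_3^2/n_1 + \theta_4^2/n_2). \tag{2.11} \]

In the original parameterization this prior becomes

\[ \pi^M(\mu_1, \mu_2, \sigma_1, \sigma_2) \propto \sigma_1^{-3}\sigma_2^{-3}(\sigma_1^2/n_1 + \sigma_2^2/n_2). \tag{2.12} \]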
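As a check on the general solution of (2.10), the following sketch uses the method of characteristics. The constant $c$ and the intermediate steps are introduced here for illustration and are not taken from the original derivation.

```latex
\begin{align*}
\frac{d\theta_3}{\theta_3^3/n_1^2} = \frac{d\theta_4}{\theta_4^3/n_2^2}
  &\;\Longrightarrow\; n_1^2\,\theta_3^{-3}\,d\theta_3 = n_2^2\,\theta_4^{-3}\,d\theta_4 \\
  &\;\Longrightarrow\; -\frac{n_1^2}{2\theta_3^2} = -\frac{n_2^2}{2\theta_4^2} - \frac{c}{2} \\
  &\;\Longrightarrow\; \frac{n_1^2}{\theta_3^2} - \frac{n_2^2}{\theta_4^2} = c .
\end{align*}
% h is constant along each characteristic curve, so h depends on
% (theta_3, theta_4) only through c.  Taking u constant then gives
\begin{align*}
\pi(\theta) \propto
\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{-1/2}
\theta_3^{-3}\theta_4^{-3}
\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right)^{3/2}
= \theta_3^{-3}\theta_4^{-3}\left(\frac{\theta_3^2}{n_1} + \frac{\theta_4^2}{n_2}\right).
\end{align*}
```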

Remark. Due to the invariance of matching priors (Datta and Ghosh, 1996; Mukerjee
and Ghosh, 1997), $\pi^M$ remains a second order probability matching prior for $\theta_1$
under any one-to-one reparameterization of $\theta$ leaving $\theta_1$ unaffected. We may note also
that Jeffreys’ independent prior $\pi^{JI}(\theta) \propto \theta_3^{-1}\theta_4^{-1}$ is a first order probability matching
prior, but is not a second order probability matching prior since it does not satisfy
(2.9).


2.2.2  HPD  Matching  Prior 


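Writing $I^{-1}(\theta) = ((I^{ij}))$, due to orthogonality of $\theta_1$ with $(\theta_2, \theta_3, \theta_4)$, from equation
(33) of DiCiccio and Stern (1994) or equation (4.1) of Ghosh and Mukerjee (1995),
the HPD equation in the present case reduces to

\[ \frac{\partial^2}{\partial\theta_1^2}\left\{I_{11}^{-1}\pi\right\}
- \frac{\partial}{\partial\theta_1}\left\{I_{11}^{-2} L_{111}\pi\right\}
- \sum_{v=2}^{4}\sum_{s=2}^{4}\frac{\partial}{\partial\theta_v}\left\{I_{11}^{-1} L_{11s} I^{sv}\pi\right\} = 0, \tag{2.13} \]

where $L_{111} = E_\theta\left[\partial^3 \log L/\partial\theta_1^3\right]$. In the present example, $L_{111} = 0$. Also, if $\pi$ does not
involve $\theta_1$, the first term in the left hand side of (2.13) is zero. With the other
necessary substitutions, (2.13) simplifies in this case to

\[ n_1^{-2}\frac{\partial}{\partial\theta_3}\left\{\theta_3^3(\theta_3^2/n_1 + \theta_4^2/n_2)^{-1}\pi\right\}
+ n_2^{-2}\frac{\partial}{\partial\theta_4}\left\{\theta_4^3(\theta_3^2/n_1 + \theta_4^2/n_2)^{-1}\pi\right\} = 0. \tag{2.14} \]

Clearly the prior $\pi^M(\theta)$ found in (2.11) satisfies (2.14), but $\pi^{JI}(\theta) \propto (\theta_3\theta_4)^{-1}$ does
not.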
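The two claims above can be checked numerically. The sketch below evaluates the left hand side of (2.14) by central differences for a prior that depends on $(\theta_3, \theta_4)$ only. Since a first order matching prior has the form $\pi = (\theta_3^2/n_1 + \theta_4^2/n_2)^{-1/2} g$, the same expression is also the left hand side of (2.9) written in terms of $\pi$. The function names, the step size and the evaluation point are hypothetical choices made for this illustration.

```python
def s(t3, t4, n1, n2):
    """The recurring quantity theta3^2/n1 + theta4^2/n2."""
    return t3 ** 2 / n1 + t4 ** 2 / n2


def matching_lhs(prior, t3, t4, n1, n2, h=1e-5):
    """Left side of (2.14) for a prior pi(theta3, theta4), by central differences."""
    def term3(a):
        return a ** 3 / s(a, t4, n1, n2) * prior(a, t4, n1, n2)

    def term4(b):
        return b ** 3 / s(t3, b, n1, n2) * prior(t3, b, n1, n2)

    d3 = (term3(t3 + h) - term3(t3 - h)) / (2 * h)
    d4 = (term4(t4 + h) - term4(t4 - h)) / (2 * h)
    return d3 / n1 ** 2 + d4 / n2 ** 2


def prior_m(t3, t4, n1, n2):
    """Second order matching prior (2.11), up to a constant."""
    return s(t3, t4, n1, n2) / (t3 ** 3 * t4 ** 3)


def prior_ji(t3, t4, n1, n2):
    """Jeffreys' independent prior, up to a constant."""
    return 1.0 / (t3 * t4)


print(matching_lhs(prior_m, 1.2, 2.0, 5, 8))   # should be numerically zero
print(matching_lhs(prior_ji, 1.2, 2.0, 5, 8))  # should be clearly nonzero
```

For Jeffreys’ independent prior the left hand side works out analytically to $2\theta_3\theta_4(\theta_3^2/n_1 + \theta_4^2/n_2)^{-2}\{1/(n_1^2n_2) + 1/(n_1n_2^2)\}$, which is strictly positive.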


2.2.3  Matching  via  Inversion  of  Posterior  Bartlett  Corrected  Likelihood 
Ratio  Statistics 


Writing $L_{1ij} = E_\theta\left[\partial^3 \log L/\partial\theta_1\partial\theta_i\partial\theta_j\right]$, $2 \le i, j \le 4$, since in this example $L_{1ij} = 0$ for all
$2 \le i, j \le 4$, the differential equation leading to the characterization of such priors is
the same for both the unconditional and the conditional Bartlett adjusted LR statistic (cf.
Yin and Ghosh, 1997, equations (2.3) and (2.4)). Further, since $L_{111} = 0$, it follows
from Theorem 1 of Yin and Ghosh (1997) that the second order probability matching
prior derived in (2.11) also has the property that the difference between Bayesian and
frequentist Bartlett corrections for the conditional likelihood ratio statistic tends to
zero as $N \to \infty$. Thus, the prior obtained in (2.11) achieves matching in this way as
well. In contrast, Jeffreys’ independent prior $\pi^{JI}(\theta) \propto \theta_3^{-1}\theta_4^{-1}$ does not satisfy either
equation (2.3) or equation (2.4) of Yin and Ghosh (1997), and is thus not a matching
prior under the criterion of this section.


2.2.4  Jeffreys’  General  Rule  Prior 

From (2.4), the determinant of $I(\theta)$ is $|I(\theta)| = 4n_1^2n_2^2\theta_3^{-4}\theta_4^{-4}$. Recall that Jeffreys’
general rule prior is proportional to the square root of the determinant of the
information matrix $I(\theta)$. Hence

\[ \pi^{JG}(\theta) \propto \theta_3^{-2}\theta_4^{-2}, \]

which is different from the independent prior proposed by Jeffreys.
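The determinant can also be obtained without expanding $I(\theta)$ directly. The following sketch assumes the relation $I(\theta) = J\,I(\mu)\,J^T$ and the block triangular form of the Jacobian of the reparameterization; the shorthand $a$ and $b$ for the two nonzero entries of its first row is introduced here for illustration.

```latex
\begin{align*}
a &= \frac{n_2\theta_3^2}{n_2\theta_3^2 + n_1\theta_4^2}, \qquad
b = -\frac{n_1\theta_4^2}{n_2\theta_3^2 + n_1\theta_4^2}, \\
|J| &= \begin{vmatrix} a & b \\ 1 & 1 \end{vmatrix} = a - b
     = \frac{n_2\theta_3^2 + n_1\theta_4^2}{n_2\theta_3^2 + n_1\theta_4^2} = 1, \\
|I(\theta)| &= |J|^2\,|I(\mu)|
  = \frac{n_1}{\theta_3^2}\cdot\frac{n_2}{\theta_4^2}\cdot\frac{2n_1}{\theta_3^2}\cdot\frac{2n_2}{\theta_4^2}
  = 4n_1^2n_2^2\,\theta_3^{-4}\theta_4^{-4}, \\
|I(\theta)|^{1/2} &= 2n_1n_2\,\theta_3^{-2}\theta_4^{-2} \;\propto\; \theta_3^{-2}\theta_4^{-2}.
\end{align*}
```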




2.2.5  Reference  Priors 


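In this section, we derive some reference priors for the Behrens-Fisher problem,
keeping $\theta_1$ fixed and making a one-to-one transformation of all the remaining
parameters. We begin with the derivation of two-group reference priors following the
prescription of Berger and Bernardo (1989). Here $\theta_1$ is the parameter of interest,
and $(\theta_2, \theta_3, \theta_4)$ is the vector of nuisance parameters. The two-group reference prior is
derived from the following four steps:

Step 1  $\pi(\theta_2, \theta_3, \theta_4 \mid \theta_1) = |I_{22}(\theta)|^{1/2} = 2n_1n_2\,\theta_3^{-2}\theta_4^{-2}(\theta_3^2/n_1 + \theta_4^2/n_2)^{1/2}$.

Step 2  Consider the rectangular compacts $A_i = \{-i < \theta_1 < i,\ -i < \theta_2 < i,\ i^{-1} < \theta_3 < i,\ i^{-1} < \theta_4 < i\}$,
$i = 1, 2, \ldots$. Since $\int_{i^{-1}}^{i}\int_{i^{-1}}^{i}\int_{-i}^{i} \pi(\theta_2, \theta_3, \theta_4 \mid \theta_1)\,d\theta_2\,d\theta_3\,d\theta_4 =$ constant,

\[ \pi_i(\theta_2, \theta_3, \theta_4 \mid \theta_1) \propto \pi(\theta_2, \theta_3, \theta_4 \mid \theta_1) \propto \theta_3^{-2}\theta_4^{-2}(\theta_3^2/n_1 + \theta_4^2/n_2)^{1/2}. \]

Step 3  $|I(\theta)|/|I_{22}(\theta)| = I_{11} = (\theta_3^2/n_1 + \theta_4^2/n_2)^{-1}$. Hence,

\[ \pi_i(\theta_1) = \exp\left\{\frac{1}{2}\int\!\!\int\!\!\int \pi_i(\theta_2, \theta_3, \theta_4 \mid \theta_1)\log I_{11}\,d\theta_2\,d\theta_3\,d\theta_4\right\} = \text{constant}. \]

Step 4  For any fixed point $\theta_{10}$ of $\theta_1$,

\[ \lim_{i \to \infty}\frac{K_i(\theta_1)\pi_i(\theta_1)}{K_i(\theta_{10})\pi_i(\theta_{10})} = 1, \]

where $K_i(\theta_1)^{-1} = \int_{i^{-1}}^{i}\int_{i^{-1}}^{i}\int_{-i}^{i} \pi(\theta_2, \theta_3, \theta_4 \mid \theta_1)\,d\theta_2\,d\theta_3\,d\theta_4$.

Hence, from steps 1-4, the two-group reference prior is $\pi^{2G}(\theta) \propto \theta_3^{-2}\theta_4^{-2}(\theta_3^2/n_1 + \theta_4^2/n_2)^{1/2}$,
which is a first order probability matching prior, but is neither a second order
probability matching prior, nor an HPD matching prior, nor a matching prior for the
frequentist and Bayesian Bartlett adjustments. Similarly, one can find three-group
and one-at-a-time reference priors following the algorithm of Berger and Bernardo
(1992b). Let $\theta_{(1)} = \theta_1$, $\theta_{(2)} = (\theta_3, \theta_4)$ and $\theta_{(3)} = \theta_2$. Then, the Fisher information
matrix for this grouping is given by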


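\[ I^*(\theta) = \begin{pmatrix} (\theta_3^2/n_1 + \theta_4^2/n_2)^{-1} & 0^T \\ 0 & I^*_{22} \end{pmatrix}, \tag{2.15} \]

where

\[ I^*_{22} = \begin{pmatrix}
\dfrac{2n_1}{\theta_3^2} + \dfrac{4\theta_1^2\theta_4^2}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} &
-\dfrac{4\theta_1^2\theta_3\theta_4}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} &
\dfrac{2\theta_1}{\theta_3(\theta_3^2/n_1 + \theta_4^2/n_2)} \\[2ex]
-\dfrac{4\theta_1^2\theta_3\theta_4}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} &
\dfrac{2n_2}{\theta_4^2} + \dfrac{4\theta_1^2\theta_3^2}{n_1n_2(\theta_3^2/n_1 + \theta_4^2/n_2)^3} &
-\dfrac{2\theta_1}{\theta_4(\theta_3^2/n_1 + \theta_4^2/n_2)} \\[2ex]
\dfrac{2\theta_1}{\theta_3(\theta_3^2/n_1 + \theta_4^2/n_2)} &
-\dfrac{2\theta_1}{\theta_4(\theta_3^2/n_1 + \theta_4^2/n_2)} &
\dfrac{n_1}{\theta_3^2} + \dfrac{n_2}{\theta_4^2}
\end{pmatrix}, \]

and by Datta and Ghosh (1996)

\begin{align*}
|h_1(\theta)| &= \frac{|I^*_{[\sim 00]}|}{|I^*_{[\sim 11]}|} = \frac{n_1n_2}{n_2\theta_3^2 + n_1\theta_4^2}, \\
|h_2(\theta)| &= \frac{|I^*_{[\sim 11]}|}{|I^*_{[\sim 22]}|} = \frac{4n_1n_2}{\theta_3^2\theta_4^2}, \\
|h_3(\theta)| &= |I^*_{[\sim 22]}| = \frac{n_1}{\theta_3^2} + \frac{n_2}{\theta_4^2}.
\end{align*}

From the algorithm of Berger and Bernardo (1992b),

\begin{align*}
\pi_3^l(\theta_{[\sim 2]} \mid \theta_{[2]}) &= \pi^l(\theta_{(3)} \mid \theta_{[2]})
  = \frac{|h_3(\theta)|^{1/2}}{\int |h_3(\theta)|^{1/2}\,d\theta_{(3)}} \propto \text{constant}, \\
\pi_2^l(\theta_{[\sim 1]} \mid \theta_{[1]}) &=
  \frac{\pi_3^l(\theta_{[\sim 2]} \mid \theta_{[2]})\exp\left\{\frac{1}{2}E_2^l\left[(\log|h_2(\theta)|) \mid \theta_{[2]}\right]\right\}}
       {\int \exp\left\{\frac{1}{2}E_2^l\left[(\log|h_2(\theta)|) \mid \theta_{[2]}\right]\right\}d\theta_{(2)}}
  \propto \theta_3^{-1}\theta_4^{-1}, \\
\pi_1^l(\theta_{[\sim 0]} \mid \theta_{[0]}) &=
  \frac{\pi_2^l(\theta_{[\sim 1]} \mid \theta_{[1]})\exp\left\{\frac{1}{2}E_1^l\left[(\log|h_1(\theta)|) \mid \theta_{[1]}\right]\right\}}
       {\int \exp\left\{\frac{1}{2}E_1^l\left[(\log|h_1(\theta)|) \mid \theta_{[1]}\right]\right\}d\theta_{(1)}}
  \propto \pi_2^l(\theta_{[\sim 1]} \mid \theta_{[1]}) \propto \theta_3^{-1}\theta_4^{-1}.
\end{align*}
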


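Hence, the three-group reference prior for $\{\theta_1\}, \{\theta_3, \theta_4\}, \{\theta_2\}$, where the parameters are
grouped according to their order of importance, is $\pi^{3G}(\theta) \propto \theta_3^{-1}\theta_4^{-1}$, which is Jeffreys’
independent prior, but is different from Jeffreys’ general rule prior. Next, consider
the one-at-a-time reference prior. Let $\theta_{(1)} = \theta_1$, $\theta_{(2)} = \theta_3$, $\theta_{(3)} = \theta_4$ and $\theta_{(4)} = \theta_2$.
Then, the Fisher information matrix for this grouping is the same as (2.15), and following
Datta and Ghosh (1996), we get

\begin{align*}
|h_1(\theta)| &= \frac{|I^*_{[\sim 00]}|}{|I^*_{[\sim 11]}|} = \frac{n_1n_2}{n_2\theta_3^2 + n_1\theta_4^2}, \\
|h_2(\theta)| &= \frac{|I^*_{[\sim 11]}|}{|I^*_{[\sim 22]}|} = \frac{2n_1}{\theta_3^2}, \\
|h_3(\theta)| &= \frac{|I^*_{[\sim 22]}|}{|I^*_{[\sim 33]}|} = \frac{2n_2}{\theta_4^2}, \\
|h_4(\theta)| &= |I^*_{[\sim 33]}| = \frac{n_1}{\theta_3^2} + \frac{n_2}{\theta_4^2}.
\end{align*}

Also, from the algorithm of Berger and Bernardo (1992b),

\begin{align*}
\pi_4^l(\theta_{[\sim 3]} \mid \theta_{[3]}) &= \pi^l(\theta_{(4)} \mid \theta_{[3]})
  = \frac{|h_4(\theta)|^{1/2}}{\int |h_4(\theta)|^{1/2}\,d\theta_{(4)}} \propto \text{constant}, \\
\pi_3^l(\theta_{[\sim 2]} \mid \theta_{[2]}) &=
  \frac{\pi_4^l(\theta_{[\sim 3]} \mid \theta_{[3]})\exp\left\{\frac{1}{2}E_3^l\left[(\log|h_3(\theta)|) \mid \theta_{[3]}\right]\right\}}
       {\int \exp\left\{\frac{1}{2}E_3^l\left[(\log|h_3(\theta)|) \mid \theta_{[3]}\right]\right\}d\theta_{(3)}}
  \propto \theta_4^{-1}, \\
\pi_2^l(\theta_{[\sim 1]} \mid \theta_{[1]}) &=
  \frac{\pi_3^l(\theta_{[\sim 2]} \mid \theta_{[2]})\exp\left\{\frac{1}{2}E_2^l\left[(\log|h_2(\theta)|) \mid \theta_{[2]}\right]\right\}}
       {\int \exp\left\{\frac{1}{2}E_2^l\left[(\log|h_2(\theta)|) \mid \theta_{[2]}\right]\right\}d\theta_{(2)}}
  \propto \theta_3^{-1}\theta_4^{-1}, \\
\pi_1^l(\theta_{[\sim 0]} \mid \theta_{[0]}) &=
  \frac{\pi_2^l(\theta_{[\sim 1]} \mid \theta_{[1]})\exp\left\{\frac{1}{2}E_1^l\left[(\log|h_1(\theta)|) \mid \theta_{[1]}\right]\right\}}
       {\int \exp\left\{\frac{1}{2}E_1^l\left[(\log|h_1(\theta)|) \mid \theta_{[1]}\right]\right\}d\theta_{(1)}}
  \propto \pi_2^l(\theta_{[\sim 1]} \mid \theta_{[1]}) \propto \theta_3^{-1}\theta_4^{-1}.
\end{align*}

Under the grouping $\{\theta_1\}, \{\theta_4\}, \{\theta_3\}, \{\theta_2\}$, the same prior is obtained as the one-at-a-time
reference prior. Other groupings are possible, but these do not yield analytically
tractable reference priors. The catalog of reference priors is given in Table 2.1.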


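Table 2.1: Catalog of reference priors for Behrens-Fisher Problem

Grouped parameters in their order of importance        Prior distribution

$\{\theta_1\}, \{\theta_2, \theta_3, \theta_4\}$                  $\pi^{2G}(\theta) \propto \theta_3^{-2}\theta_4^{-2}(\theta_3^2/n_1 + \theta_4^2/n_2)^{1/2}$

$\{\theta_1\}, \{\theta_3, \theta_4\}, \{\theta_2\}$              $\pi^{3G}(\theta) \propto \theta_3^{-1}\theta_4^{-1}$

$\{\theta_1\}, \{\theta_3\}, \{\theta_4\}, \{\theta_2\}$          $\pi^{1A}(\theta) \propto \theta_3^{-1}\theta_4^{-1}$

$\{\theta_1\}, \{\theta_4\}, \{\theta_3\}, \{\theta_2\}$          $\pi^{1A}(\theta) \propto \theta_3^{-1}\theta_4^{-1}$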
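The determinant ratios $|h_j(\theta)|$ used in the one-at-a-time derivation can be checked numerically. The sketch below assumes the entries of the information matrix for $(\theta_2, \theta_3, \theta_4)$ as given in (2.5) and the ordering $\theta_3, \theta_4, \theta_2$; the function names and the evaluation point are hypothetical.

```python
def info_block(theta, n1, n2):
    """Entries of I(theta) for (theta2, theta3, theta4), following (2.5)."""
    t1, _, t3, t4 = theta
    s = t3 ** 2 / n1 + t4 ** 2 / n2
    i22 = n1 / t3 ** 2 + n2 / t4 ** 2
    i23 = 2 * t1 / (t3 * s)
    i24 = -2 * t1 / (t4 * s)
    i33 = 2 * n1 / t3 ** 2 + 4 * t1 ** 2 * t4 ** 2 / (n1 * n2 * s ** 3)
    i34 = -4 * t1 ** 2 * t3 * t4 / (n1 * n2 * s ** 3)
    i44 = 2 * n2 / t4 ** 2 + 4 * t1 ** 2 * t3 ** 2 / (n1 * n2 * s ** 3)
    return i22, i23, i24, i33, i34, i44


def h_ratios(theta, n1, n2):
    """Return (|h2|, |h3|, |h4|) for the ordering theta3, theta4, theta2."""
    i22, i23, i24, i33, i34, i44 = info_block(theta, n1, n2)
    # 3x3 determinant of the block ordered as (theta3, theta4, theta2).
    det3 = (i33 * (i44 * i22 - i24 * i24)
            - i34 * (i34 * i22 - i24 * i23)
            + i23 * (i34 * i24 - i44 * i23))
    # 2x2 determinant of the trailing block ordered as (theta4, theta2).
    det2 = i44 * i22 - i24 ** 2
    return det3 / det2, det2 / i22, i22


print(h_ratios((1.5, 0.3, 1.2, 2.0), n1=5, n2=8))
```

The three values should agree with $2n_1/\theta_3^2$, $2n_2/\theta_4^2$ and $n_1/\theta_3^2 + n_2/\theta_4^2$ respectively, none of which involves $\theta_1$.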



2.3  Propriety  of  Posteriors  and  Simulation  Study 


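As we mentioned in the previous section, Jeffreys’ independent prior is the same
as some reference priors such as $\pi^{3G}$ and $\pi^{1A}$. We denote this prior as $\pi^R$. The first
task is to verify the propriety of posteriors under the noninformative priors $\pi^R$, $\pi^{JG}$
and $\pi^M$. To this end, we find it convenient to go back to the original parameterization
$(\mu_1, \mu_2, \sigma_1, \sigma_2)$. The joint posteriors of $\mu_1, \mu_2, \sigma_1$ and $\sigma_2$ given $x_1, x_2$ under the priors
$\pi^R$, $\pi^{JG}$ and $\pi^M$ are given respectively by

\[ \pi^R(\mu_1, \mu_2, \sigma_1, \sigma_2 \mid x_1, x_2) \propto
\exp\left\{-\frac{1}{2\sigma_1^2}\sum_{j=1}^{n_1}(x_{1j} - \mu_1)^2 - \frac{1}{2\sigma_2^2}\sum_{j=1}^{n_2}(x_{2j} - \mu_2)^2\right\}
\times \sigma_1^{-(n_1+1)}\sigma_2^{-(n_2+1)}, \tag{2.16} \]

\[ \pi^{JG}(\mu_1, \mu_2, \sigma_1, \sigma_2 \mid x_1, x_2) \propto
\exp\left\{-\frac{1}{2\sigma_1^2}\sum_{j=1}^{n_1}(x_{1j} - \mu_1)^2 - \frac{1}{2\sigma_2^2}\sum_{j=1}^{n_2}(x_{2j} - \mu_2)^2\right\}
\times \sigma_1^{-(n_1+2)}\sigma_2^{-(n_2+2)}, \tag{2.17} \]

\[ \pi^M(\mu_1, \mu_2, \sigma_1, \sigma_2 \mid x_1, x_2) \propto
\exp\left\{-\frac{1}{2\sigma_1^2}\sum_{j=1}^{n_1}(x_{1j} - \mu_1)^2 - \frac{1}{2\sigma_2^2}\sum_{j=1}^{n_2}(x_{2j} - \mu_2)^2\right\}
\times \sigma_1^{-(n_1+3)}\sigma_2^{-(n_2+3)}(\sigma_1^2/n_1 + \sigma_2^2/n_2). \tag{2.18} \]
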

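From (2.16), integration with respect to $\sigma_1$ and $\sigma_2$ yields

\[ \pi^R(\mu_1, \mu_2 \mid x_1, x_2) \propto
\left[1 + \frac{n_1(\mu_1 - \bar{x}_1)^2}{(n_1 - 1)s_1^2}\right]^{-n_1/2}
\left[1 + \frac{n_2(\mu_2 - \bar{x}_2)^2}{(n_2 - 1)s_2^2}\right]^{-n_2/2}. \tag{2.19} \]

Thus, the joint posterior of $\mu_1$ and $\mu_2$ based on the prior $\pi^R$ is a product of two
independent $t$’s with respective location parameters $\bar{x}_1, \bar{x}_2$, scale parameters $s_1/\sqrt{n_1}$
and $s_2/\sqrt{n_2}$, and degrees of freedom $n_1 - 1$ and $n_2 - 1$. This posterior is proper for
$n_1 \ge 2$ and $n_2 \ge 2$.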




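Similarly, from (2.17), integration with respect to $\sigma_1$ and $\sigma_2$ yields

\[ \pi^{JG}(\mu_1, \mu_2 \mid x_1, x_2) \propto
\left[1 + \frac{n_1(\mu_1 - \bar{x}_1)^2}{(n_1 - 1)s_1^2}\right]^{-\frac{n_1+1}{2}}
\left[1 + \frac{n_2(\mu_2 - \bar{x}_2)^2}{(n_2 - 1)s_2^2}\right]^{-\frac{n_2+1}{2}}. \tag{2.20} \]

Thus, the joint posterior of $\mu_1$ and $\mu_2$ based on the prior $\pi^{JG}$ is a product of two
independent $t$’s with respective location parameters $\bar{x}_1, \bar{x}_2$, scale parameters
$s_1\{(n_1 - 1)/n_1^2\}^{1/2}$ and $s_2\{(n_2 - 1)/n_2^2\}^{1/2}$, and degrees of freedom $n_1$ and $n_2$. This
posterior is proper for $n_1 > 1$ and $n_2 > 1$.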

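Next, from (2.18), integration with respect to $\sigma_1$ and $\sigma_2$ yields

\[ \pi^M(\mu_1, \mu_2 \mid x_1, x_2) = C\left[
\frac{n_2}{n_1}\left\{\sum_{j=1}^{n_1}(x_{1j} - \mu_1)^2\right\}^{-\frac{n_1}{2}}\left\{\sum_{j=1}^{n_2}(x_{2j} - \mu_2)^2\right\}^{-\frac{n_2+2}{2}}
+ \frac{n_1}{n_2}\left\{\sum_{j=1}^{n_1}(x_{1j} - \mu_1)^2\right\}^{-\frac{n_1+2}{2}}\left\{\sum_{j=1}^{n_2}(x_{2j} - \mu_2)^2\right\}^{-\frac{n_2}{2}}
\right], \tag{2.21} \]

where the constant $C$ depends on $x_1, x_2, n_1$ and $n_2$, but not on the parameters
$(\mu_1, \mu_2, \sigma_1, \sigma_2)$. The factors $n_2/n_1$ and $n_1/n_2$ combine the weights $1/n_1$ and $1/n_2$ of the
prior with the gamma functions produced by the integration over $\sigma_1$ and $\sigma_2$. We rewrite
the right hand side of (2.21) as

\[ C\left[
\frac{n_2}{n_1}u_1^{-\frac{n_1}{2}}u_2^{-\frac{n_2+2}{2}}
\left\{1 + \frac{n_1(\mu_1 - \bar{x}_1)^2}{u_1}\right\}^{-\frac{n_1}{2}}
\left\{1 + \frac{n_2(\mu_2 - \bar{x}_2)^2}{u_2}\right\}^{-\frac{n_2+2}{2}}
+ \frac{n_1}{n_2}u_1^{-\frac{n_1+2}{2}}u_2^{-\frac{n_2}{2}}
\left\{1 + \frac{n_1(\mu_1 - \bar{x}_1)^2}{u_1}\right\}^{-\frac{n_1+2}{2}}
\left\{1 + \frac{n_2(\mu_2 - \bar{x}_2)^2}{u_2}\right\}^{-\frac{n_2}{2}}
\right], \tag{2.22} \]

where $u_i = \sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$ $(i = 1, 2)$. Each factor in braces is the kernel of a Student’s $t$
density, since $\int\{1 + n_i(\mu_i - \bar{x}_i)^2/u_i\}^{-(\nu+1)/2}d\mu_i = (u_i/n_i)^{1/2}B(\frac{1}{2}, \frac{\nu}{2})$ with
$B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$. Now, after some simplifications, one can write

\[ \pi^M(\mu_1, \mu_2 \mid x_1, x_2) =
w\, f_{\bar{x}_1,\, u_1^{1/2}(n_1(n_1-1))^{-1/2},\, n_1-1}(\mu_1)\, f_{\bar{x}_2,\, u_2^{1/2}(n_2(n_2+1))^{-1/2},\, n_2+1}(\mu_2)
+ (1 - w)\, f_{\bar{x}_1,\, u_1^{1/2}(n_1(n_1+1))^{-1/2},\, n_1+1}(\mu_1)\, f_{\bar{x}_2,\, u_2^{1/2}(n_2(n_2-1))^{-1/2},\, n_2-1}(\mu_2), \tag{2.23} \]

where $f_{\phi_1, \phi_2, \nu}(\cdot)$ denotes the pdf of Student’s $t$ with location parameter $\phi_1$, scale
parameter $\phi_2$ and degrees of freedom $\nu$, and

\[ w = \frac{u_1\{n_1(n_1 - 1)\}^{-1}}{u_1\{n_1(n_1 - 1)\}^{-1} + u_2\{n_2(n_2 - 1)\}^{-1}} = \frac{s_1^2/n_1}{s_1^2/n_1 + s_2^2/n_2}. \]

The propriety of the posterior $\pi^M(\mu_1, \mu_2, \sigma_1, \sigma_2 \mid x_1, x_2)$ follows immediately from (2.19)
and (2.23) for $n_1 \ge 2$ and $n_2 \ge 2$.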

Next we prove a theorem which establishes the symmetry and unimodality of the
posterior of $\mu_1 - \mu_2$ around $\bar{x}_1 - \bar{x}_2$ under the priors $\pi^R$, $\pi^{JG}$ and $\pi^M$.

Theorem 2.1  The posterior of $\mu_1 - \mu_2$ is symmetric and unimodal about $\bar{x}_1 - \bar{x}_2$
under the priors $\pi^R$, $\pi^{JG}$ and $\pi^M$.

Proof of Theorem 2.1  From (2.19), (2.20) and (2.23), it is easy to check that the posterior of
$(\mu_1, \mu_2)$ is symmetric about $(\bar{x}_1, \bar{x}_2)$ under the priors $\pi^R$, $\pi^{JG}$ and $\pi^M$. This implies
that the posterior of $\mu_1 - \mu_2$ is symmetric about $\bar{x}_1 - \bar{x}_2$ under these priors. In order to prove
unimodality, first under the priors $\pi^R$ and $\pi^{JG}$, we observe that the posterior pdf’s of
$\mu_1 - \bar{x}_1$ and $\mu_2 - \bar{x}_2$ are both symmetric and unimodal about 0 on the real line. Since
the concepts of symmetric unimodality and central convex unimodality coincide on
the real line, both these posteriors are central convex unimodal about 0. Due to the
independence of these two posteriors, from Theorem 2.4, pp 45-46 of Dharmadhikari
and Joag-dev (1988), the joint posterior of $(\mu_1 - \bar{x}_1, \mu_2 - \bar{x}_2)$ is unimodal about $(0, 0)$.
Now, from Theorem 2.15, p 59 of Dharmadhikari and Joag-dev (1988), central convex
unimodality implies linear unimodality, and hence $(\mu_1 - \mu_2) - (\bar{x}_1 - \bar{x}_2)$ is unimodal
about 0, or equivalently $\mu_1 - \mu_2$ is unimodal about $\bar{x}_1 - \bar{x}_2$.

In order to prove the unimodality of $\mu_1 - \mu_2$ about $\bar{x}_1 - \bar{x}_2$ under the prior $\pi^M$, first,
as before, the joint posteriors
$f_{\bar{x}_1,\, u_1^{1/2}(n_1(n_1-1))^{-1/2},\, n_1-1}(\mu_1)\, f_{\bar{x}_2,\, u_2^{1/2}(n_2(n_2+1))^{-1/2},\, n_2+1}(\mu_2)$
and
$f_{\bar{x}_1,\, u_1^{1/2}(n_1(n_1+1))^{-1/2},\, n_1+1}(\mu_1)\, f_{\bar{x}_2,\, u_2^{1/2}(n_2(n_2-1))^{-1/2},\, n_2-1}(\mu_2)$
are both central convex unimodal about $(\bar{x}_1, \bar{x}_2)$. Hence, the convex combination (2.23) is also central convex
unimodal about $(\bar{x}_1, \bar{x}_2)$. The same Theorem 2.15 of Dharmadhikari and Joag-dev
now establishes unimodality of $\mu_1 - \mu_2$ about $\bar{x}_1 - \bar{x}_2$.


Next we undertake a simulation study to find the frequentist coverage probabilities
of the Bayesian credible intervals for $\mu_1 - \mu_2$ based on the priors $\pi^R$, $\pi^{JG}$
and $\pi^M$. We first generate samples $x_1, x_2$ of sizes $n_1$ and $n_2$ for a given choice of
$(\mu_1, \mu_2, \sigma_1, \sigma_2)$. A variety of $(n_1, n_2)$ combinations are considered: in some instances,
$n_1$ and $n_2$ are close, while on other occasions, they are quite dispersed. Then, from
the joint posteriors derived in (2.19), (2.20) and (2.23), we generate samples 1,000
times, compute $\mu_1 - \mu_2$ each time, and find numerically the 5% and 95% posterior
quantiles of $\mu_1 - \mu_2$. Due to the symmetry and unimodality of the posterior pdf’s
of $\mu_1 - \mu_2$ about $\bar{x}_1 - \bar{x}_2$ under the priors $\pi^R$, $\pi^{JG}$ and $\pi^M$, we consider equal tailed
credible intervals and check whether or not the actual difference $\mu_1 - \mu_2$ belongs to
this interval. The whole process is repeated 5,000 times, and we find the proportion
of times the true difference $\mu_1 - \mu_2$ belongs to this interval. This is the estimated
frequentist coverage probability of the Bayesian credible interval. The results are
given in Tables 2.2 and 2.3. The computing was done on a Sun Sparc 10
workstation using FORTRAN77. All random numbers are generated by a
FORTRAN77 subroutine based on Uniform(0, 1) random variables.
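The simulation scheme described above can be sketched in a few lines. The version below is only an illustration for Jeffreys’ independent prior, whose posterior (2.19) is a product of two independent $t$ distributions; it is not the original FORTRAN program, and the function names, the seed and the reduced numbers of replications are assumptions made to keep the example fast.

```python
import math
import random
from statistics import mean, variance


def draw_t(df, rng):
    """One Student's t variate: standard normal over sqrt(chi-square / df)."""
    return rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)


def posterior_quantiles(x1, x2, rng, draws, probs=(0.05, 0.95)):
    """Monte Carlo posterior quantiles of mu1 - mu2 under the posterior (2.19)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    sc1 = math.sqrt(variance(x1) / n1)  # scale s1 / sqrt(n1)
    sc2 = math.sqrt(variance(x2) / n2)  # scale s2 / sqrt(n2)
    diffs = sorted((m1 + sc1 * draw_t(n1 - 1, rng)) - (m2 + sc2 * draw_t(n2 - 1, rng))
                   for _ in range(draws))
    return [diffs[int(round(p * (draws - 1)))] for p in probs]


def coverage(n1, n2, mu1, mu2, sd1, sd2, reps, draws, rng):
    """Proportion of replications in which the true mu1 - mu2 falls at or below
    the 0.05 and the 0.95 posterior quantile."""
    below = [0, 0]
    for _ in range(reps):
        x1 = [rng.gauss(mu1, sd1) for _ in range(n1)]
        x2 = [rng.gauss(mu2, sd2) for _ in range(n2)]
        q = posterior_quantiles(x1, x2, rng, draws)
        for k in (0, 1):
            below[k] += (mu1 - mu2) <= q[k]
    return [b / reps for b in below]


rng = random.Random(2000)
# sd2 = sqrt(2) corresponds to a second variance of 2.0.
print(coverage(10, 10, 2.0, 0.0, 1.0, math.sqrt(2.0), reps=200, draws=400, rng=rng))
```

With so few replications the estimates are noisy; the study described above uses 1,000 posterior draws and 5,000 replications. The other two priors would need samplers for (2.20) and (2.23), which are not sketched here.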


The findings of Tables 2.2 and 2.3 are quite impressive for the prior $\pi^M$ when
compared to the priors $\pi^R$ and $\pi^{JG}$, especially for small and moderate values of $n_1$
and $n_2$. For the cases $n_1 = 2$ and $n_2 = 20$ or $n_1 = 20$ and $n_2 = 2$, the coverage
probabilities of the Bayesian credible intervals based on the second order probability
matching prior meet the frequentist target (here 90%) better than Jeffreys’ independent
(or the one-at-a-time reference) prior. Also, for all the cases considered here,



Table 2.2: Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\theta_1$ when
$\mu_1 = 2.0$, $\mu_2 = 0.0$ and $\sigma_1^2 = 1.0$ ($n_1 \ge n_2$)

                            $\sigma_2^2 = 2.0$      $\sigma_2^2 = 4.0$
$n_1$   $n_2$   $\pi$            0.05    0.95      0.05    0.95

  2      2    $\pi^R$          0.007   0.991     0.010   0.989
              $\pi^{JG}$       0.085   0.915     0.086   0.913
              $\pi^M$          0.017   0.979     0.024   0.979

 20      2    $\pi^R$          0.036   0.803     0.029   0.822
              $\pi^{JG}$       0.100   0.902     0.077   0.922
              $\pi^M$          0.110   0.856     0.086   0.878

  3      2    $\pi^R$          0.013   0.988     0.011   0.987
              $\pi^{JG}$       0.059   0.928     0.064   0.928
              $\pi^M$          0.028   0.970     0.030   0.970

  5      3    $\pi^R$          0.021   0.976     0.025   0.978
              $\pi^{JG}$       0.054   0.945     0.054   0.944
              $\pi^M$          0.038   0.958     0.042   0.960

  7      5    $\pi^R$          0.031   0.969     0.034   0.959
              $\pi^{JG}$       0.047   0.949     0.051   0.943
              $\pi^M$          0.041   0.955     0.045   0.948

 12      8    $\pi^R$          0.041   0.962     0.043   0.956
              $\pi^{JG}$       0.051   0.952     0.053   0.948
              $\pi^M$          0.051   0.952     0.050   0.950

 15     10    $\pi^R$          0.043   0.960     0.048   0.956
              $\pi^{JG}$       0.051   0.950     0.056   0.948
              $\pi^M$          0.051   0.954     0.053   0.952

 20     15    $\pi^R$          0.046   0.958     0.044   0.955
              $\pi^{JG}$       0.051   0.953     0.048   0.951
              $\pi^M$          0.052   0.953     0.049   0.951

 30     20    $\pi^R$          0.049   0.951     0.047   0.953
              $\pi^{JG}$       0.051   0.945     0.051   0.950
              $\pi^M$          0.051   0.948     0.051   0.950



Table 2.3: Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\theta_1$
when $\mu_1 = 2.0$, $\mu_2 = 0.0$ and $\sigma_1^2 = 1.0$ ($n_1 < n_2$)

                            $\sigma_2^2 = 2.0$      $\sigma_2^2 = 4.0$
$n_1$   $n_2$   $\pi$            0.05    0.95      0.05    0.95

  2      3    $\pi^R$          0.015   0.983     0.023   0.977
              $\pi^{JG}$       0.085   0.919     0.092   0.901
              $\pi^M$          0.036   0.965     0.042   0.952

  2     20    $\pi^R$          0.036   0.788     0.043   0.788
              $\pi^{JG}$       0.120   0.880     0.123   0.861
              $\pi^M$          0.110   0.819     0.104   0.819

  3      5    $\pi^R$          0.033   0.968     0.033   0.964
              $\pi^{JG}$       0.068   0.933     0.074   0.924
              $\pi^M$          0.050   0.950     0.050   0.945

  5      7    $\pi^R$          0.039   0.963     0.045   0.962
              $\pi^{JG}$       0.056   0.943     0.064   0.940
              $\pi^M$          0.049   0.952     0.052   0.952

  8     12    $\pi^R$          0.046   0.953     0.044   0.956
              $\pi^{JG}$       0.058   0.941     0.057   0.944
              $\pi^M$          0.055   0.945     0.051   0.949

 10     15    $\pi^R$          0.044   0.952     0.044   0.949
              $\pi^{JG}$       0.052   0.944     0.053   0.943
              $\pi^M$          0.051   0.948     0.048   0.948

 15     20    $\pi^R$          0.046   0.955     0.047   0.953
              $\pi^{JG}$       0.054   0.949     0.052   0.947
              $\pi^M$          0.053   0.951     0.049   0.952

 20     30    $\pi^R$          0.048   0.953     0.048   0.952
              $\pi^{JG}$       0.052   0.950     0.051   0.946
              $\pi^M$          0.051   0.950     0.049   0.951



the coverage probability of the one-sided credible interval based on the second order
probability matching prior is closer to the frequentist target at the upper 5% level.
The same is true for the lower 5% level as well, except for the cases $n_1 = 2$ and $n_2 = 20$
or $n_1 = 20$ and $n_2 = 2$. More importantly, there are noticeable differences between the
two approaches for small and moderate values of $n_1$ and $n_2$. Finally, one does not
require very large $n_1$ and $n_2$ for these priors to be useful.


2.4  Conditional  Frequentist  Properties 


As mentioned in the introduction, the Welch-Aspin solution has been criticized
by Robinson (1982) because it may lead to a negatively biased relevant subset in
the sense of Buehler (1959). To be specific, if $I_{WA}(X_1, X_2)$ denotes the Welch-Aspin
interval for $\mu_1 - \mu_2$, then there exists some subset $C$ of the sample space such that

\[ P_{\mu_1,\mu_2,\sigma_1,\sigma_2}\left[\mu_1 - \mu_2 \in I_{WA}(X_1, X_2) \mid (X_1, X_2) \in C\right] \le 1 - \alpha - \epsilon, \quad (0 < \epsilon < 1 - \alpha) \tag{2.24} \]

for all $\mu_1, \mu_2, \sigma_1$ and $\sigma_2$. On the other hand, Robinson (1976) showed that credible
intervals based on Jeffreys’ independent prior $\pi^R$ do not admit any negatively
biased relevant set in the sense of (2.24). Since the credible interval based on Jeffreys’
independent prior is algebraically equivalent to the original Behrens-Fisher solution,
it follows that the latter is more satisfactory than the Welch-Aspin solution from a
conditional frequentist perspective, since the conditional coverage probabilities of the
Behrens-Fisher intervals are never smaller than the nominal confidence level.

The alternate prior $\pi^M$ also does not admit any negatively biased relevant set.
This can be proved by mimicking the arguments of Robinson (1976) and by modifying
appropriately the sequence of priors considered by him. The main point is that any
Bayesian credible interval for $\mu_1 - \mu_2$ usually has this nice conditional property as
long as it contains $\bar{x}_1 - \bar{x}_2$.




2.5  Concluding  Remarks 

In  this  chapter,  we  have  derived  a new  prior  different  from  Jeffreys’  independent 
and  general  rule  priors  for  the  Behrens-Fisher  problem.  This  prior  possesses  good 
frequentist  properties  in  that  the  coverage  probabilities  of  credible  intervals  for  the 
difference  of  two  normal  means  based  on  this  prior  match  their  frequentist  counterpart 
very  closely  even  for  small  and  moderate  sample  sizes.  The  proposed  prior  is  also 
very  satisfactory  from  a conditional  frequentist  perspective. 

Default priors are being routinely used by Bayesians (and often by non-Bayesians
as well) to analyze real life data. We do not expect that there will be a single default
prior which will stand out well above the others on every single occasion. Based on
the four criteria as mentioned in the introduction, there could be different default
priors each optimal according to one of these criteria. On occasions, the one-at-a-time
reference prior emerges as the optimal one both according to criteria (i) and
(ii). Such examples are found in Tibshirani (1989), Mukerjee and Dey (1993) and
Datta and Ghosh, M. (1995). In these instances, clearly one should prescribe the
one-at-a-time reference prior as the default prior.

The  situation  is  not  so  clear-cut  for  the  Behrens-Fisher  problem.  It  is  well-known 
in  this  case  that  the  Welch-Aspin  frequentist  solution  is  not  satisfactory  even  from 
a conditional  frequentist  perspective,  while  Behrens’  solution,  though  algebraically 
equivalent  to  the  solution  provided  by  Jeffreys’  independent  prior,  has  not  found 
appeal  to  frequentists.  One  major  motivation  behind  the  present  work  was  to  find  (if 
possible)  a prior  which  should  be  of  appeal  to  frequentists,  conditional  frequentists 
as  well  as  default  Bayesians.  It  seems  to  us  that  the  proposed  second  order  matching 
prior  meets  this  end  quite  well. 

Clearly there are possible extensions of the proposed work. One possible extension is to the ANOVA model where homoscedasticity fails, and one is interested in finding a suitable confidence interval for a treatment contrast. A different but related problem is to find a confidence interval for the common mean of several independent normal populations with possibly unequal variances. Development of default priors in these cases should be of appeal to both Bayesians and frequentists. We plan to address some of these problems in later studies.


CHAPTER  3 

INTERVAL  ESTIMATION  OF  THE  COMMON  MEAN  OF  SEVERAL  NORMAL 

POPULATIONS 


3.1  Introduction 


Estimation of the common mean of several normal populations has received attention for several decades. The problem arises naturally in balanced incomplete block designs with uncorrelated random block effects and fixed treatment effects, where the intrablock estimator $\hat{\tau}$ and the interblock estimator $\tilde{\tau}$ of a treatment contrast are independent normal with a common mean $\tau$, but their variances are unknown and unequal. A second example relates to meta-analysis where, for example, several clinics provide estimates of a common parameter of interest, and the problem is how to combine these estimates meaningfully into a single one.

Point  estimation  of  the  common  mean  has  been  addressed  both  from  the  classical 
and  decision  theoretic  points  of  view.  Among  others,  we  may  refer  to  Graybill  and 
Deal  (1959),  Zacks  (1966,1970),  Khatri  and  Shah  (1974),  Shinozaki  (1978),  Cohen 
and  Sackrowitz  (1974),  Brown  and  Cohen  (1974),  Bhattacharya  (1980),  Sinha  and 
Mouqadem  (1982),  Sinha  (1985)  and  Kubokawa  (1990). 

In contrast, relatively little attention has been paid to interval estimation of the common mean. Approximate confidence intervals are found in Meier (1953), Brown and Cohen (1974), Sinha (1985) and Eberhardt, Reeve and Spiegelman (1989). Exact intervals are proposed in Fairweather (1972), Cohen and Sackrowitz (1984) and Jordan and Krishnamoorthy (1996). More recently, Sinha (1998) and Yu, Sun and Sinha (1999) compared these and other exact intervals by their lengths at a common confidence coefficient.




From  the  examples  considered  in  Yu,  Sun  and  Sinha  (1999),  it  appears  that  none 
of  the  frequentist  procedures  emerges  as  a clear  winner.  Indeed,  in  one  of  the  two 
examples  considered  in  their  paper,  the  Fairweather  interval  is  the  shortest,  while 
in  the  second  example  the  inverse  normal  interval  is  the  shortest.  Thus,  a unified 
frequentist  method  for  finding  an  exact  confidence  interval  of  the  common  mean 
seems  to  be  lacking. 

In  this  chapter,  we  offer  an  alternative  approach  by  introducing  Bayesian  priors 
such  that  the  resulting  credible  intervals  for  the  common  mean  have  coverage  probabil- 
ities asymptotically  equivalent  to  their  frequentist  counterparts.  Though  asymptotic, 
our  simulation  results  indicate  that  this  matching  holds  even  for  small  and  moderate 
sample sizes. Also, for both the examples considered by Yu, Sun and Sinha (1999), the lengths of the Bayesian intervals are nearly equal to those of the shortest frequentist intervals considered there. While the appeal of a Bayesian procedure does not rest solely on its frequentist properties, in the absence of prior elicitation one is somewhat compelled to use default priors, and one criterion for choosing such priors is their frequentist performance.

The outline of the remaining sections is as follows. In Section 3.2, we introduce several criteria of matching priors, and obtain one which stands out particularly well under any of these criteria. Propriety of the posterior under this prior is proved in Section 3.3. A small simulation study is undertaken in Section 3.4, which shows that the second order probability matching prior developed in this chapter matches the frequentist target coverage probabilities better than the reference prior, especially for small or moderate samples. Section 3.5 reviews several confidence intervals that have been proposed from a frequentist point of view. Also, in that section, the length of the Bayesian credible interval is compared against all its frequentist competitors for two real-life examples considered in Yu, Sun and Sinha (1999).




3.2  Noninformative  Priors 

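Suppose there are $k\ (\geq 2)$ independent normal populations with unknown common mean $\mu$ and with unknown and possibly unequal variances $\sigma_1^2, \ldots, \sigma_k^2$. We consider samples of sizes $n_i = N p_i$ $(i = 1, \ldots, k)$, where each $n_i \geq 2$ and $\sum_{i=1}^{k} p_i = 1$. The sample observations are denoted by $X_{ij}$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, k)$. The minimal sufficient statistic is $(\bar{X}_1, \ldots, \bar{X}_k, S_1^2, \ldots, S_k^2)$, where $\bar{X}_i = n_i^{-1} \sum_{j=1}^{n_i} X_{ij}$ and $S_i^2 = (n_i - 1)^{-1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$. The likelihood function is given by

$$L(\mu, \sigma_1, \ldots, \sigma_k) \propto \Big( \prod_{i=1}^{k} \sigma_i^{-n_i} \Big) \exp\Big[ -\frac{1}{2} \sum_{j=1}^{k} \sigma_j^{-2} \big\{ n_j (\bar{X}_j - \mu)^2 + (n_j - 1) S_j^2 \big\} \Big]. \qquad (3.1)$$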


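Simple calculations yield the Fisher information matrix as

$$I(\mu, \sigma_1, \ldots, \sigma_k) = \mathrm{Diagonal}\Big( \sum_{i=1}^{k} n_i \sigma_i^{-2},\ 2 n_1 \sigma_1^{-2},\ \ldots,\ 2 n_k \sigma_k^{-2} \Big). \qquad (3.2)$$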


We  now  consider  several  noninformative  priors  all  having  their  roots  in  the  Fisher 
information  matrix  given  in  (3.2)  in  some  form  or  the  other. 


3.2.1  Jeffreys’  Prior 

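Jeffreys' prior is proportional to the positive square root of the determinant of $I(\mu, \sigma_1, \ldots, \sigma_k)$. Thus,

$$\pi^{J}(\mu, \sigma_1, \ldots, \sigma_k) \propto \Big( \sum_{i=1}^{k} n_i \sigma_i^{-2} \Big)^{1/2} \prod_{i=1}^{k} \sigma_i^{-1}. \qquad (3.3)$$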




3.2.2  Reference  Priors 

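First we consider the two-group reference prior. The parameter space is partitioned into $\{\mu\}$, $\{\sigma_1, \ldots, \sigma_k\}$, $\mu$ being the parameter of interest. Let $\theta_{(1)} = \mu$ and $\theta_{(2)} = (\sigma_1, \ldots, \sigma_k)$. Then, the Fisher information matrix is written as

$$I(\theta) = \mathrm{Block\ Diagonal}\{ h_1(\theta), h_2(\theta) \},$$

where

$$h_1(\theta) = \sum_{i=1}^{k} n_i \sigma_i^{-2}, \qquad h_2(\theta) = \mathrm{Diagonal}\big( 2 n_1 \sigma_1^{-2}, \ldots, 2 n_k \sigma_k^{-2} \big).$$

Hence, from Theorem 1 of Datta and M. Ghosh (1995), the two-group reference prior is given by

$$\pi^{R}(\mu, \sigma_1, \ldots, \sigma_k) \propto \prod_{i=1}^{k} \sigma_i^{-1}. \qquad (3.4)$$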

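Next consider the one-at-a-time reference prior with $\theta = (\theta_1, \ldots, \theta_p) = (\mu, \sigma_1, \ldots, \sigma_k)$. Then, from the Lemma of Datta and Ghosh (1996),

$$|h_1(\theta)| = \frac{|I|}{|I_{[\sim 11]}|} = \sum_{i=1}^{k} n_i \sigma_i^{-2},$$

$$|h_2(\theta)| = \frac{|I_{[\sim 11]}|}{|I_{[\sim 22]}|} = \frac{2 n_1}{\sigma_1^2},$$

$$\vdots$$

$$|h_k(\theta)| = \frac{|I_{[\sim k-1, k-1]}|}{|I_{[\sim kk]}|} = \frac{2 n_{k-1}}{\sigma_{k-1}^2},$$

$$|h_{k+1}(\theta)| = |I_{[\sim kk]}| = \frac{2 n_k}{\sigma_k^2}.$$

Also, following the algorithm of Berger and Bernardo (1992b) and Datta and Ghosh (1996) as in the previous chapter, the one-at-a-time reference prior for $\{\mu, \sigma_1, \ldots, \sigma_k\}$ is

$$\pi^{R}(\mu, \sigma_1, \ldots, \sigma_k) \propto \prod_{i=1}^{k} \sigma_i^{-1}. \qquad (3.5)$$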


3.2.3  Quantile  Matching  Priors 

Due to the orthogonality of $\mu$ with $(\sigma_1, \ldots, \sigma_k)$ (cf. Cox and Reid, 1987), from Tibshirani (1989), the class of first order probability matching priors is characterized by

$$\pi(\mu, \sigma_1, \ldots, \sigma_k) \propto \Big( \sum_{i=1}^{k} n_i \sigma_i^{-2} \Big)^{1/2} g(\sigma_1, \ldots, \sigma_k), \qquad (3.6)$$

where $g$ is any arbitrary positive-valued function differentiable in its arguments. Thus both Jeffreys' prior and the one-at-a-time (or two-group) reference prior are first order probability matching priors. As one may note from (3.6), any smooth positive function $g$ of $\sigma_1, \ldots, \sigma_k$ yields a first order probability matching prior. This class of priors, however, is too large to be useful in practice. Hence, there is a need to narrow down the selection of priors within this class, and in particular, to develop second order probability matching priors.

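From Mukerjee and Ghosh (1997), due to the orthogonality of $\mu$ with $(\sigma_1, \ldots, \sigma_k)$, the class of second order probability matching priors is characterized as solutions of the second order partial differential equation

$$\frac{1}{6}\, g\, \frac{\partial}{\partial \mu} \big\{ I_{11}^{-3/2} L_{1,1,1} \big\} + \sum_{i=1}^{k} \frac{\partial}{\partial \sigma_i} \big\{ I_{11}^{-1/2} L_{11i} I^{ii} g \big\} = 0, \qquad (3.7)$$

where $I_{11} = \sum_{j=1}^{k} n_j \sigma_j^{-2}$ and $I^{ii} = \sigma_i^2 / (2 n_i)$ are obtained from (3.2),

$$L_{1,1,1} = E\Big[ \Big( \frac{\partial \log L}{\partial \mu} \Big)^3 \,\Big|\, \mu, \sigma_1, \ldots, \sigma_k \Big] \quad \text{and} \quad L_{11i} = E\Big[ \frac{\partial^3 \log L}{\partial \mu^2 \partial \sigma_i} \,\Big|\, \mu, \sigma_1, \ldots, \sigma_k \Big], \quad i = 1, \ldots, k.$$

From (3.1), after some algebra, we have

$$L_{1,1,1} = 0, \qquad L_{11i} = \frac{2 n_i}{\sigma_i^3}.$$

Hence, (3.7) simplifies to

$$\sum_{i=1}^{k} \frac{\partial}{\partial \sigma_i} \Big\{ \Big( \sum_{j=1}^{k} n_j \sigma_j^{-2} \Big)^{-1/2} \sigma_i^{-1}\, g(\sigma_1, \ldots, \sigma_k) \Big\} = 0. \qquad (3.8)$$

A general solution to (3.8) is difficult to find. However, one particular solution is given by $g(\sigma_1, \ldots, \sigma_k) \propto \big( \sum_{i=1}^{k} n_i \sigma_i^{-2} \big)^{1/2} \prod_{i=1}^{k} \sigma_i$, since the expression in braces in (3.8) then reduces to $\prod_{j \neq i} \sigma_j$, which is free of $\sigma_i$. Accordingly, one particular second order probability matching prior is given by

$$\pi^{Q}(\mu, \sigma_1, \ldots, \sigma_k) \propto \Big( \sum_{i=1}^{k} n_i \sigma_i^{-2} \Big) \prod_{i=1}^{k} \sigma_i. \qquad (3.9)$$

Also, it is easy to see from (3.8) that neither Jeffreys' prior $\pi^{J}$ nor the reference prior $\pi^{R}$ is a second order probability matching prior.

3.2.4  HPD  Matching  Prior

Due to the orthogonality of $\mu$ with $(\sigma_1, \ldots, \sigma_k)$, from equation (33) of DiCiccio and Stern (1994) or equation (4.1) of Ghosh and Mukerjee (1995), a prior $\pi$ is an HPD matching prior if and only if it satisfies

$$\frac{\partial^2}{\partial \mu^2} \big\{ I_{11}^{-1} \pi \big\} - \frac{\partial}{\partial \mu} \big\{ I_{11}^{-2} L_{111} \pi \big\} - \sum_{i=1}^{k} \frac{\partial}{\partial \sigma_i} \big\{ I_{11}^{-1} L_{11i} I^{ii} \pi \big\} = 0, \qquad (3.10)$$

where $L_{111} = E\big[ \partial^3 \log L / \partial \mu^3 \,\big|\, \mu, \sigma_1, \ldots, \sigma_k \big]$. Since $L_{111} = 0$ in this case, and the first term in the left hand side of (3.10) is 0 for any prior $\pi$ of the form (3.6), one needs to solve (3.8) once again. Thus, $\pi^{Q}$ is an HPD matching prior, but $\pi^{J}$ and $\pi^{R}$ are not.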
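The claims about (3.8) can be checked numerically. The sketch below assumes that (3.8) reads $\sum_i \partial/\partial\sigma_i \{ (\sum_j n_j \sigma_j^{-2})^{-1/2} \sigma_i^{-1} g \} = 0$ and that each prior is written as $(\sum_j n_j \sigma_j^{-2})^{1/2} g$; the function names and the finite-difference step are illustrative choices, not part of the original derivation.

```python
import math

def info_mu(sigma, n):
    """I_11 = sum_j n_j / sigma_j^2, the Fisher information for mu in (3.2)."""
    return sum(nj / s ** 2 for nj, s in zip(n, sigma))

def g_jeffreys(sigma, n):
    # Jeffreys' prior (3.3) written as I_11^{1/2} * g, so g = prod sigma_i^{-1}
    return 1.0 / math.prod(sigma)

def g_reference(sigma, n):
    # Reference prior (3.5) written as I_11^{1/2} * g
    return info_mu(sigma, n) ** -0.5 / math.prod(sigma)

def g_second_order(sigma, n):
    # Particular solution that leads to the prior in (3.9)
    return math.sqrt(info_mu(sigma, n)) * math.prod(sigma)

def pde_residual(g, sigma, n, h=1e-5):
    """Central-difference value of sum_i d/dsigma_i { I_11^{-1/2} sigma_i^{-1} g }."""
    total = 0.0
    for i in range(len(sigma)):
        def inner(value, i=i):
            s = list(sigma)
            s[i] = value
            return info_mu(s, n) ** -0.5 / value * g(s, n)
        total += (inner(sigma[i] + h) - inner(sigma[i] - h)) / (2.0 * h)
    return total

sigma, n = [1.0, 2.0], [3, 2]
for name, g in [("Jeffreys", g_jeffreys), ("reference", g_reference),
                ("second order", g_second_order)]:
    print(name, pde_residual(g, sigma, n))
```

Under these assumptions the residual is numerically zero only for the second order prior, in line with the statement that neither Jeffreys' prior nor the reference prior solves (3.8).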


3.2.5  Matching  via  Inversion  of  Posterior  Bartlett  Corrected  Likelihood 
Ratio  Statistics 

It follows from equations (2.3) and (2.4) of Yin and Ghosh (1997) that the differential equation leading to the characterization of such priors is the same for both the unconditional and the conditional Bartlett adjusted likelihood ratio statistics. Since $L_{111} = 0$ in this case, it follows from Theorem 1 of Yin and Ghosh (1997) that the second order probability matching prior derived in (3.9) also has the property that the difference between the Bayesian and frequentist Bartlett corrections for the conditional likelihood ratio statistic tends to zero as $N \to \infty$. This property is not shared by either $\pi^{J}$ or $\pi^{R}$.


3.3  Propriety  of  Posteriors 

Since the priors $\pi^{J}$, $\pi^{R}$ and $\pi^{Q}$ are all improper, the first task is to establish the propriety of the posteriors under these priors. This is what we accomplish in this section. Indeed, if the posterior is improper, it is pointless to talk about descriptive measures such as the mean, median and quantiles, or about point estimates, credible intervals, etc.

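We begin with $\pi^{R}$. The joint posterior of $\mu, \sigma_1, \ldots, \sigma_k$ given $X$ based on the reference prior $\pi^{R}$ is

$$\pi^{R}(\mu, \sigma_1, \ldots, \sigma_k \mid X) \propto \prod_{j=1}^{k} \sigma_j^{-(n_j + 1)} \exp\Big[ -\frac{1}{2} \sum_{j=1}^{k} \sigma_j^{-2} \big\{ n_j (\mu - \bar{X}_j)^2 + (n_j - 1) S_j^2 \big\} \Big]. \qquad (3.11)$$

Next, integrating with respect to $\sigma_1, \ldots, \sigma_k$, the marginal posterior of $\mu$ given $X$ based on the reference prior $\pi^{R}$ is

$$\pi^{R}(\mu \mid X) \propto \prod_{j=1}^{k} \big[ n_j (\mu - \bar{X}_j)^2 + (n_j - 1) S_j^2 \big]^{-n_j/2} \propto \prod_{j=1}^{k} \Big[ 1 + \frac{n_j (\mu - \bar{X}_j)^2}{(n_j - 1) S_j^2} \Big]^{-n_j/2}. \qquad (3.12)$$

This implies

$$\pi^{R}(\mu \mid X) \leq C \Big[ 1 + \frac{n_1 (\mu - \bar{X}_1)^2}{(n_1 - 1) S_1^2} \Big]^{-n_1/2}, \qquad (3.13)$$

where $C\ (> 0)$ depends on $(\bar{X}_1, \ldots, \bar{X}_k, S_1^2, \ldots, S_k^2)$, but not on $\mu$. Now, the finiteness of $\int_{-\infty}^{\infty} \pi^{R}(\mu \mid X)\, d\mu$ follows by recognizing the right hand side of (3.13) as a constant multiple of a Student's t-pdf.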

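Next, we prove the propriety of $\pi^{Q}(\mu, \sigma_1, \ldots, \sigma_k \mid X)$ based on the second order probability matching prior $\pi^{Q}$. First, we write

$$\pi^{Q}(\mu, \sigma_1, \ldots, \sigma_k \mid X) \propto \exp\Big[ -\frac{1}{2} \sum_{m=1}^{k} \sigma_m^{-2} \big\{ n_m (\mu - \bar{X}_m)^2 + (n_m - 1) S_m^2 \big\} \Big] \times \Big( \prod_{i=1}^{k} \sigma_i^{-(n_i - 1)} \Big) \Big( \sum_{j=1}^{k} n_j \sigma_j^{-2} \Big)$$

$$= \exp\Big[ -\frac{1}{2} \sum_{m=1}^{k} \sigma_m^{-2} \big\{ n_m (\mu - \bar{X}_m)^2 + (n_m - 1) S_m^2 \big\} \Big] \times \sum_{j=1}^{k} \Big\{ n_j \sigma_j^{-(n_j + 1)} \prod_{i \neq j} \sigma_i^{-(n_i - 1)} \Big\}. \qquad (3.14)$$

Next, integrating with respect to $\sigma_1, \ldots, \sigma_k$, the marginal posterior of $\mu$ given $X$ based on the second order probability matching prior is

$$\pi^{Q}(\mu \mid X) \propto \sum_{j=1}^{k} n_j \big\{ n_j (\mu - \bar{X}_j)^2 + (n_j - 1) S_j^2 \big\}^{-n_j/2} \times \prod_{i \neq j} \big\{ n_i (\mu - \bar{X}_i)^2 + (n_i - 1) S_i^2 \big\}^{-\frac{1}{2}(n_i - 2)}$$

$$= \sum_{j=1}^{k} n_j \big\{ (n_j - 1) S_j^2 \big\}^{-n_j/2} \prod_{i \neq j} \big\{ (n_i - 1) S_i^2 \big\}^{-\frac{1}{2}(n_i - 2)} \times \Big[ 1 + \frac{n_j (\mu - \bar{X}_j)^2}{(n_j - 1) S_j^2} \Big]^{-n_j/2} \prod_{i \neq j} \Big[ 1 + \frac{n_i (\mu - \bar{X}_i)^2}{(n_i - 1) S_i^2} \Big]^{-\frac{1}{2}(n_i - 2)}. \qquad (3.15)$$

Thus,

$$\pi^{Q}(\mu \mid X) \leq C_{*} \sum_{j=1}^{k} n_j \big\{ (n_j - 1) S_j^2 \big\}^{-n_j/2} \prod_{i \neq j} \big\{ (n_i - 1) S_i^2 \big\}^{-\frac{1}{2}(n_i - 2)} \Big[ 1 + \frac{n_j (\mu - \bar{X}_j)^2}{(n_j - 1) S_j^2} \Big]^{-n_j/2}, \qquad (3.16)$$

where $C_{*}\ (> 0)$ depends on the data, but not on $\mu$. Once again, from (3.16), the finiteness of $\int_{-\infty}^{\infty} \pi^{Q}(\mu \mid X)\, d\mu$ follows from the finiteness of t-integrals.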

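Finally, the propriety of $\pi^{J}(\mu, \sigma_1, \ldots, \sigma_k \mid X)$ follows by arguments similar to those used in proving the finiteness of $\int_{-\infty}^{\infty} \int_0^{\infty} \cdots \int_0^{\infty} \pi^{Q}(\mu, \sigma_1, \ldots, \sigma_k \mid X)\, d\mu\, d\sigma_1 \cdots d\sigma_k$.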


3.4  Simulation  Study 


We compare in this section the two priors $\pi^{R}$ and $\pi^{Q}$ by calculating the frequentist probabilities $P[\mu < \mu^{1-\alpha}(\pi^{R}; X) \mid \mu, \sigma_1, \sigma_2]$ and $P[\mu < \mu^{1-\alpha}(\pi^{Q}; X) \mid \mu, \sigma_1, \sigma_2]$. We write $F^{\pi^{R}}(z) = P^{\pi^{R}}(\mu \leq z \mid X)$ and $F^{\pi^{Q}}(z) = P^{\pi^{Q}}(\mu \leq z \mid X)$. Thus, $F^{\pi^{R}}(\mu^{1-\alpha}(\pi^{R}; X)) = 1 - \alpha = F^{\pi^{Q}}(\mu^{1-\alpha}(\pi^{Q}; X))$.

Let $P(\alpha; \pi^{R}, \mu, \sigma_1, \sigma_2) = P(\mu < \mu^{1-\alpha}(\pi^{R}; X) \mid \mu, \sigma_1, \sigma_2)$ and $P(\alpha; \pi^{Q}, \mu, \sigma_1, \sigma_2) = P(\mu < \mu^{1-\alpha}(\pi^{Q}; X) \mid \mu, \sigma_1, \sigma_2)$. If the quantities $P(\alpha; \pi^{R}, \mu, \sigma_1, \sigma_2)$ and $P(\alpha; \pi^{Q}, \mu, \sigma_1, \sigma_2)$ are close to $1 - \alpha$ even if the sample sizes are small, then there is evidence that these priors perform well with respect to the probability matching criterion.

We generate a variety of samples of sizes $n_1$ and $n_2$ from $N(\mu, \sigma_1^2)$ and $N(\mu, \sigma_2^2)$ distributions. Throughout, we take $\mu = 0$ and $\sigma_2^2 = 1$, but take different values of $\sigma_1^2$. The results in Tables 3.1 and 3.2 are reported for $\sigma_1^2 = 2$ and $\sigma_1^2 = 4$.

$P(\alpha; \pi^{R}, \mu, \sigma_1, \sigma_2)$ and $P(\alpha; \pi^{Q}, \mu, \sigma_1, \sigma_2)$ are estimated in the following way. For each pair $(n_1, n_2)$, 10,000 random samples $(x_{11}, \ldots, x_{1 n_1})$ and $(x_{21}, \ldots, x_{2 n_2})$ are generated using FORTRAN 77 software. Next, $F^{\pi^{R}}(\mu)$ and $F^{\pi^{Q}}(\mu)$ are computed for each of the 10,000 sets of random samples. Then $P(\alpha; \pi^{R}, \mu, \sigma_1, \sigma_2)$ and $P(\alpha; \pi^{Q}, \mu, \sigma_1, \sigma_2)$ are estimated by the proportions of times $F^{\pi^{R}}(\mu)$ and $F^{\pi^{Q}}(\mu)$ are less than or equal to $1 - \alpha$.

It is clear from Tables 3.1 and 3.2 that $\pi^{Q}$ matches the target much more accurately than $\pi^{R}$. This is intuitively expected since the former is a second order matching prior, but the latter is not.
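The estimation scheme just described can be sketched in a few lines. The following is not the FORTRAN program used for the tables; it is a minimal Python illustration for the reference prior only, assuming the marginal posterior is proportional to $\prod_j [n_j(\mu - \bar{X}_j)^2 + (n_j - 1)S_j^2]^{-n_j/2}$ as in (3.12). The grid-based integration, the grid width, the function names and the reduced number of replications are all assumptions made for the example.

```python
import math
import random

def log_marginal_ref(mu, n, xbar, s2):
    """Log of the unnormalised marginal posterior of mu under the reference prior."""
    return -sum(0.5 * m * math.log(m * (mu - x) ** 2 + (m - 1) * v)
                for m, x, v in zip(n, xbar, s2))

def posterior_cdf(mu0, n, xbar, s2, grid=1001, width=15.0):
    """F(mu0) = P(mu <= mu0 | X), by the trapezoidal rule on a finite grid."""
    scale = max(math.sqrt(v / m) for m, v in zip(n, s2))
    lo = min(xbar) - width * scale
    hi = max(xbar) + width * scale
    if mu0 <= lo:
        return 0.0
    if mu0 >= hi:
        return 1.0
    step = (hi - lo) / (grid - 1)
    pts = [lo + i * step for i in range(grid)]
    logd = [log_marginal_ref(p, n, xbar, s2) for p in pts]
    top = max(logd)                      # subtract the maximum to avoid underflow
    dens = [math.exp(v - top) for v in logd]
    total = below = 0.0
    for i in range(grid - 1):
        total += 0.5 * (dens[i] + dens[i + 1]) * step
        right = min(pts[i + 1], mu0)     # part of the cell lying below mu0
        if right > pts[i]:
            frac = (right - pts[i]) / step
            d_right = dens[i] + frac * (dens[i + 1] - dens[i])
            below += 0.5 * (dens[i] + d_right) * (right - pts[i])
    return below / total

def coverage_upper_quantile(n, sigma2, alpha=0.05, reps=400, seed=0):
    """Proportion of samples with F(mu) <= 1 - alpha when the true mu is 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar, s2 = [], []
        for m, var in zip(n, sigma2):
            x = [rng.gauss(0.0, math.sqrt(var)) for _ in range(m)]
            mean = sum(x) / m
            xbar.append(mean)
            s2.append(sum((xi - mean) ** 2 for xi in x) / (m - 1))
        if posterior_cdf(0.0, n, xbar, s2) <= 1.0 - alpha:
            hits += 1
    return hits / reps

print(coverage_upper_quantile((15, 10), (2.0, 1.0)))
```

The same loop applies to the second order matching prior once `log_marginal_ref` is replaced by the logarithm of the corresponding marginal posterior; with only a few hundred replications the estimate carries a Monte Carlo error of roughly one percentage point.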


3.5  Comparison  with  Other  Frequentist  Methods 


We introduce in this section some existing frequentist methods for construction of confidence intervals for $\mu$ and compare the performance of the second order probability matching prior against these by analyzing two real datasets. These frequentist methods are all discussed in Yu, Sun and Sinha (1999).

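Recall that $\bar{X}_i = \sum_{j=1}^{n_i} X_{ij} / n_i$ and $(n_i - 1) S_i^2 = \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$, $i = 1, \ldots, k$. Since

$$T_i = \frac{\sqrt{n_i}\,(\bar{X}_i - \mu)}{S_i} \sim t_{n_i - 1} \qquad (3.17)$$

or, equivalently,

$$F_i = \frac{n_i (\bar{X}_i - \mu)^2}{S_i^2} \sim F_{1, n_i - 1} \qquad (3.18)$$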


are standard test statistics for testing hypotheses about $\mu$ based on the $i$th sample, suitable linear combinations of $|T_i|$'s or $F_i$'s or other functions thereof have often been suggested as pivots to construct exact confidence intervals for $\mu$. This is precisely what is accomplished in Cohen and Sackrowitz (1984), and in Jordan and Krishnamoorthy (1996). Their work is summarized below.

Table 3.1: Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\mu$ when $\mu = 0.0$ and $\sigma_2^2 = 1.0$ ($n_1 > n_2$)

                        $\sigma_1^2 = 2.0$      $\sigma_1^2 = 4.0$
n_1    n_2    Prior     0.05      0.95      0.05      0.95
  3      2    $\pi^R$   0.097     0.908     0.095     0.912
              $\pi^Q$   0.050     0.951     0.050     0.950
  7      5    $\pi^R$   0.070     0.934     0.068     0.936
              $\pi^Q$   0.054     0.949     0.054     0.949
 15     10    $\pi^R$   0.057     0.945     0.057     0.944
              $\pi^Q$   0.051     0.951     0.051     0.949
 20     15    $\pi^R$   0.054     0.944     0.054     0.945
              $\pi^Q$   0.049     0.949     0.050     0.950
 30     20    $\pi^R$   0.056     0.945     0.054     0.945
              $\pi^Q$   0.053     0.948     0.051     0.949

Table 3.2: Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles of $\mu$ when $\mu = 0.0$ and $\sigma_2^2 = 1.0$ ($n_1 < n_2$)

                        $\sigma_1^2 = 2.0$      $\sigma_1^2 = 4.0$
n_1    n_2    Prior     0.05      0.95      0.05      0.95
  2      3    $\pi^R$   0.095     0.916     0.086     0.921
              $\pi^Q$   0.055     0.949     0.055     0.949
  5      7    $\pi^R$   0.065     0.936     0.061     0.941
              $\pi^Q$   0.050     0.951     0.049     0.951
 10     15    $\pi^R$   0.058     0.945     0.056     0.946
              $\pi^Q$   0.053     0.950     0.051     0.950
 15     20    $\pi^R$   0.055     0.946     0.054     0.945
              $\pi^Q$   0.052     0.948     0.050     0.949
 20     30    $\pi^R$   0.055     0.946     0.053     0.946
              $\pi^Q$   0.052     0.948     0.050     0.948


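3.5.1  Confidence  Interval  for  $\mu$  Based  on  $T_i$'s

Cohen and Sackrowitz (1984) suggested use of $M_t = \max_{1 \leq i \leq k} \{ |T_i| \}$ as a test statistic for testing hypotheses about $\mu$. We can use $M_t$ to construct a confidence interval for $\mu$ once the cutoff point of the distribution of $M_t$ is known. This cutoff point under the null hypothesis is denoted by $c_{\alpha/2}$. Thus, if $c_{\alpha/2}$ satisfies the condition

$$1 - \alpha = P[M_t \leq c_{\alpha/2}] = \prod_{i=1}^{k} P[\,|T_i| \leq c_{\alpha/2}\,],$$

an exact confidence interval for $\mu$ with confidence level $1 - \alpha$ is given by

$$\Big[ \max_{1 \leq i \leq k} \Big\{ \bar{X}_i - \frac{c_{\alpha/2} S_i}{\sqrt{n_i}} \Big\},\ \min_{1 \leq i \leq k} \Big\{ \bar{X}_i + \frac{c_{\alpha/2} S_i}{\sqrt{n_i}} \Big\} \Big]. \qquad (3.19)$$

Determination of the cutoff point $c_{\alpha/2}$ is not easy in applications, and simulation may be necessary. An alternative approach is to use the confidence interval

$$\Big[ \max_{1 \leq i \leq k} \Big\{ \bar{X}_i - \frac{c^{(i)}_{\alpha/2} S_i}{\sqrt{n_i}} \Big\},\ \min_{1 \leq i \leq k} \Big\{ \bar{X}_i + \frac{c^{(i)}_{\alpha/2} S_i}{\sqrt{n_i}} \Big\} \Big], \qquad (3.20)$$

where $c^{(i)}_{\alpha/2}$ satisfies $P[\,|T_i| \leq c^{(i)}_{\alpha/2}\,] = (1 - \alpha)^{1/k}$. This latter interval clearly also has an exact coverage probability $1 - \alpha$.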
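As an illustration, the interval (3.20) can be evaluated directly from summary statistics once the cutoffs are available. The sketch below takes the cutoffs as given inputs and uses the albumin summary statistics of Table 3.3 together with the cutoffs listed for this method in Table 3.4; the function name is hypothetical.

```python
import math

def max_min_interval(n, xbar, s2, cutoffs):
    """Interval of the form (3.20): largest lower limit and smallest upper limit."""
    half = [c * math.sqrt(v / m) for c, v, m in zip(cutoffs, s2, n)]
    lower = max(x - h for x, h in zip(xbar, half))
    upper = min(x + h for x, h in zip(xbar, half))
    return lower, upper

# Albumin data: sample sizes, means and variances (Table 3.3),
# with the cutoffs reported for (3.20) in Table 3.4.
n = [12, 15, 7, 16]
xbar = [62.3, 60.3, 59.5, 61.5]
s2 = [12.986, 7.840, 33.433, 18.513]
cutoffs = [2.9702, 2.8543, 3.5055, 2.8272]

lower, upper = max_min_interval(n, xbar, s2, cutoffs)
print(round(lower, 2), round(upper, 2))
```

With these inputs the result agrees with the interval reported for this method in Table 3.4 to within rounding. Note that the construction can return an empty interval (lower limit above upper limit) when the individual t-intervals do not overlap.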

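Fairweather (1972) suggested using a weighted linear combination of the $T_i$'s, namely,

$$W_t = \sum_{i=1}^{k} u_i T_i, \qquad u_i = \frac{[\mathrm{Var}(T_i)]^{-1}}{\sum_{j=1}^{k} [\mathrm{Var}(T_j)]^{-1}}, \qquad (3.21)$$

which is also a pivot. If $b_{\alpha/2}$ denotes the cutoff point of the distribution of $W_t$, satisfying the equation

$$1 - \alpha = P[\,|W_t| \leq b_{\alpha/2}\,],$$

then the confidence interval for $\mu$ is obtained as

$$\Big[ \frac{\sum_{i=1}^{k} \sqrt{n_i}\, u_i \bar{X}_i / S_i - b_{\alpha/2}}{\sum_{i=1}^{k} \sqrt{n_i}\, u_i / S_i},\ \frac{\sum_{i=1}^{k} \sqrt{n_i}\, u_i \bar{X}_i / S_i + b_{\alpha/2}}{\sum_{i=1}^{k} \sqrt{n_i}\, u_i / S_i} \Big], \qquad (3.22)$$

or, equivalently,

$$\frac{\sum_{i=1}^{k} \sqrt{n_i}\, u_i \bar{X}_i / S_i}{\sum_{i=1}^{k} \sqrt{n_i}\, u_i / S_i} \pm \frac{b_{\alpha/2}}{\sum_{i=1}^{k} \sqrt{n_i}\, u_i / S_i}. \qquad (3.23)$$

It may be noted that the $u_i$'s can be easily derived using the fact that

$$\mathrm{var}(T_\nu) = \frac{\nu}{\nu - 2}, \qquad \nu > 2.$$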
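A small sketch of the Fairweather construction follows. It assumes that $T_i$ has $n_i - 1$ degrees of freedom, so that $[\mathrm{Var}(T_i)]^{-1} = (n_i - 3)/(n_i - 1)$, and that the interval is centred at the weighted mean with half-width $b_{\alpha/2} / \sum_i \sqrt{n_i}\, u_i / S_i$. The cutoff $b_{\alpha/2}$ is treated as a given input (taken from Table 3.4 in the usage lines), and the function names are illustrative.

```python
import math

def fairweather_weights(n):
    """Weights u_i proportional to 1 / Var(t_{n_i - 1}) = (n_i - 3) / (n_i - 1)."""
    inv_var = [(m - 3) / (m - 1) for m in n]   # requires n_i > 3
    total = sum(inv_var)
    return [v / total for v in inv_var]

def fairweather_interval(n, xbar, s2, b):
    """Weighted-mean interval with half-width b / sum_i sqrt(n_i) u_i / S_i."""
    u = fairweather_weights(n)
    coef = [math.sqrt(m / v) * ui for m, v, ui in zip(n, s2, u)]
    denom = sum(coef)
    centre = sum(c * x for c, x in zip(coef, xbar)) / denom
    return centre - b / denom, centre + b / denom

# Weights for the sample sizes of the selenium example (Table 3.5)
print([round(u, 4) for u in fairweather_weights([8, 12, 14, 8])])

# Interval for the albumin example (Table 3.3) with b = 1.102 from Table 3.4
print(fairweather_interval([12, 15, 7, 16], [62.3, 60.3, 59.5, 61.5],
                           [12.986, 7.840, 33.433, 18.513], 1.102))
```

The weights depend only on the sample sizes, which is why they can be tabulated before any data are seen; the interval itself still needs the simulated or tabulated cutoff $b_{\alpha/2}$.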


3.5.2  Confidence  Interval  for  $\mu$  Based  on  $F_i$'s

Jordan and Krishnamoorthy (1996) suggested using linear combinations of the $F_i$'s such as $W_f = \sum_{i=1}^{k} w_i F_i$ for positive weights $w_i$, which is again a pivot. Hence, if we can compute $a_\alpha$ such that

$$P[W_f \leq a_\alpha] = 1 - \alpha,$$

then, after simplification, an exact confidence interval for $\mu$ with confidence level $1 - \alpha$ is given by

$$\Big[ \sum_{i=1}^{k} p_i \bar{X}_i - \sqrt{\Delta},\ \sum_{i=1}^{k} p_i \bar{X}_i + \sqrt{\Delta} \Big], \qquad (3.24)$$

where

$$p_i = \frac{w_i n_i / S_i^2}{\sum_{j=1}^{k} w_j n_j / S_j^2}$$

and

$$\Delta = \frac{a_\alpha}{\sum_{j=1}^{k} w_j n_j / S_j^2} - \sum_{i=1}^{k} p_i \bar{X}_i^2 + \Big( \sum_{i=1}^{k} p_i \bar{X}_i \Big)^2.$$

Jordan and Krishnamoorthy (1996) took $w_i$ inversely proportional to $\mathrm{var}(F_i) = 2 m_i^2 (m_i - 1) / [(m_i - 2)^2 (m_i - 4)]$, where $m_i = n_i - 1$, resulting in

$$w_i = \frac{[(m_i - 2)^2 (m_i - 4)] / [m_i^2 (m_i - 1)]}{\sum_{j=1}^{k} [(m_j - 2)^2 (m_j - 4)] / [m_j^2 (m_j - 1)]}.$$

Of course, it is assumed that $n_i > 5$ for all the $k$ populations.

3.5.3  Confidence  Regions  for  $\mu$  Based  on  $P_i$'s

Since $F_i$ defined in (3.18) can be used for testing hypotheses about $\mu$, one defines the $i$th P value, $P_i$, as

$$P_i = \int_{F_i}^{\infty} f_i(x)\, dx,$$

where $f_i(x)$ denotes the pdf of the F distribution with 1 and $(n_i - 1)$ degrees of freedom. Recalling the fact that $P_1, \ldots, P_k$ are iid uniformly distributed random variables, one can combine them using several well known methods (Hedges and Olkin, 1985). In particular, we discuss below methods such as Tippett's (Tippett, 1931), Fisher's (Fisher, 1932), inverse normal (Stouffer, Suchman, DeVinney, Star and Williams, 1949), and logit (George, 1977) for data analysis, and compare them with the Bayesian procedure.

Tippett's  Method

If $P_{[1]}$ is the minimum of $P_1, P_2, \ldots, P_k$, then Tippett's method rejects the hypothesis about $\mu$ if $P_{[1]} < c_1 = 1 - (1 - \alpha)^{1/k}$. By inverting this rejection region, we have a confidence interval for $\mu$ with confidence coefficient $1 - \alpha$, given by

$$\mathrm{C.I.} = \{ \mu : P_{[1]} > c_1 \} = \{ \mu : P_i > c_1,\ i = 1, \ldots, k \} = \Big\{ \mu : \int_{F_i}^{\infty} f_i(x)\, dx > 1 - (1 - \alpha)^{1/k},\ i = 1, \ldots, k \Big\}. \qquad (3.25)$$

It may be noted that (3.25) exactly coincides with (3.20).

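Fisher's  Method

Since Fisher's method rejects hypotheses about $\mu$ when $-2 \sum_{i=1}^{k} \log P_i > \chi^2_{2k, \alpha}$, the confidence interval for $\mu$ obtained by inverting the acceptance region of this test is given by

$$\mathrm{C.I.} = \Big\{ \mu : -2 \sum_{i=1}^{k} \log P_i \leq \chi^2_{2k, \alpha} \Big\}. \qquad (3.26)$$

Inverse  Normal  Method

Since this method rejects hypotheses about $\mu$ when $\sum_{i=1}^{k} \Phi^{-1}(P_i) / \sqrt{k} < -z_\alpha$, the $1 - \alpha$ level confidence interval for $\mu$ obtained by inverting this acceptance region is given by

$$\mathrm{C.I.} = \Big\{ \mu : \frac{1}{\sqrt{k}} \sum_{i=1}^{k} \Phi^{-1}(P_i) \geq -z_\alpha \Big\}, \qquad (3.27)$$

where $\Phi(\cdot)$ is the cdf of the standard normal distribution.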
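To make the inversion of Fisher's test concrete, the sketch below scans a grid of candidate values of $\mu$ and keeps those for which $-2 \sum_i \log P_i$ does not exceed the upper $\alpha$ point of $\chi^2_{2k}$. It is an illustration under several assumptions: $P_i$ is computed as the two-sided t probability $P(|t_{n_i - 1}| > |T_i|)$, which equals the F-based P value of (3.18); the t probability is obtained by Simpson's rule; the chi-square tail uses the closed form available for even degrees of freedom; and the accepted set is summarised by its smallest and largest grid points. All function names and grid settings are hypothetical.

```python
import math

def t_two_sided_p(t, df, steps=200):
    """P(|T_df| > |t|), integrating the Student t density by Simpson's rule."""
    t = abs(t)
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    s = pdf(0.0) + pdf(t) + sum((4 if i % 2 else 2) * pdf(i * h)
                                for i in range(1, steps))
    return max(1.0 - 2.0 * s * h / 3.0, 1e-300)   # guard against log(0)

def chi2_sf_even(x, k):
    """P(chi-square with 2k degrees of freedom > x), closed form."""
    half, term, total = x / 2.0, 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total

def fisher_region(n, xbar, s2, alpha=0.05, grid=800, width=6.0):
    """Smallest and largest grid values of mu accepted by Fisher's test."""
    k = len(n)
    se = [math.sqrt(v / m) for m, v in zip(n, s2)]
    lo = min(xbar) - width * max(se)
    hi = max(xbar) + width * max(se)
    accepted = []
    for i in range(grid + 1):
        mu = lo + (hi - lo) * i / grid
        stat = -2.0 * sum(math.log(t_two_sided_p((x - mu) / s, m - 1))
                          for x, s, m in zip(xbar, se, n))
        if chi2_sf_even(stat, k) >= alpha:
            accepted.append(mu)
    return (min(accepted), max(accepted)) if accepted else None

# Albumin summary statistics (Table 3.3)
print(fisher_region([12, 15, 7, 16], [62.3, 60.3, 59.5, 61.5],
                    [12.986, 7.840, 33.433, 18.513]))
```

The endpoints are only as fine as the grid spacing, and the code does not check that the accepted set is connected; a root finder on the boundary condition would be the natural refinement.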




Logit  Method

This method rejects $H_0$ if $\sum_{i=1}^{k} \log [P_i / (1 - P_i)] < c$, where $c$ is a predetermined constant. It was shown by George and Mudholkar (1977) that the distribution of

$$G^{*} = \Big[ \sum_{i=1}^{k} \log \frac{P_i}{1 - P_i} \Big] \big[\, \cdots \,\big]^{1/2}$$

can be closely approximated by a standard normal distribution. Therefore a $(1 - \alpha)$ level confidence interval for $\mu$ can be obtained from

$$\mathrm{C.I.} = \{ \mu : G^{*} < z_\alpha \}. \qquad (3.28)$$


3.5.4  Data  Analysis 

In  order  to  see  and  compare  the  performance  of  the  second  order  probability 
matching  prior  derived  in  Section  3.2  with  the  frequentist  methods  described  in  the 
previous  section,  we  analyze  two  sets  of  real-life  data. 

Example  1 In  an  example  given  by  Snedecor  (1950)  the  data  from  four  experiments 
are  used  to  estimate  the  percentage  of  albumin  in  plasma  protein  of  normal  human 
subjects.  This  dataset  is  also  reported  in  Meier  (1953)  and  is  analyzed  in  Jordan  and 
Krishnamoorthy  (1996).  We  would  like  to  combine  the  results  of  four  experiments  in 
order  to  construct  a confidence  interval  for  the  common  mean  /x.  The  data  appear  in 
Table  3.3. 

Table 3.3: Percentage of albumin in plasma protein

Experiment    n_i    Mean    Variance
A             12     62.3    12.986
B             15     60.3     7.840
C              7     59.5    33.433
D             16     61.5    18.513

We have applied all the methods described in Section 3.5, and computed the two-sided confidence intervals for $\mu$ with $\alpha = 0.05$. Also, we have computed the equal two-sided intervals and HPD regions based on the second order probability matching prior to compare its performance at the same significance level. Figure 3.1 illustrates the marginal posterior pdf of the common mean $\mu$ corresponding to the second order probability matching prior. Table 3.4 gives the resulting equal two-sided confidence intervals and HPD regions. It turns out that the intervals based on the Bayesian procedure are comparable to the best frequentist procedure in the sense of having the smallest observed length. The best frequentist procedure here is the one due to Fairweather (1972), whose confidence interval has length 2.30. The Bayes confidence interval has length 2.31.


Table 3.4: Interval Estimates for $\mu$

Methods                  Critical Values                            Weights                                       Interval
C & S (3.19)             c = 0.3043                                                                               [59.14, 62.50]
T (3.20)                 c_i = 2.9702, 2.8543, 3.5055, 2.8272                                                     [59.20, 62.36]
F (3.23)                 b = 1.102                                  u_i = 0.2550, 0.2671, 0.2708, 0.2701          [59.89, 62.19]
J & K (3.24)             a = 3.191                                  p_i = 0.2100, 0.5245, 0.0181, 0.2474          [59.56, 62.44]
Fisher (3.26)                                                                                                     [59.47, 62.56]
Inverse Normal (3.27)                                                                                             [59.27, 62.66]
Logit (3.28)                                                                                                      [59.29, 62.61]
Prob. Matching Prior
  Equal Two-sided                                                                                                 [59.85, 62.16]
  HPD region                                                                                                      [59.86, 62.17]
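The Jordan and Krishnamoorthy row of Table 3.4 can be retraced from the summary statistics in Table 3.3. The sketch below assumes normalised weights $w_i \propto (m_i - 2)^2 (m_i - 4) / [m_i^2 (m_i - 1)]$ with $m_i = n_i - 1$, and an interval centred at $\sum_i p_i \bar{X}_i$ with squared half-width $a_\alpha / \sum_j (w_j n_j / S_j^2) - \sum_i p_i (\bar{X}_i - \sum_l p_l \bar{X}_l)^2$, which is our reading of (3.24). The critical value $a = 3.191$ is taken from the table rather than computed, and the function name is hypothetical.

```python
import math

def jk_interval(n, xbar, s2, a):
    """Interval based on W_f = sum_i w_i F_i, given the cutoff a for W_f."""
    m = [ni - 1 for ni in n]                      # requires n_i > 5
    raw = [(mi - 2) ** 2 * (mi - 4) / (mi ** 2 * (mi - 1)) for mi in m]
    w = [r / sum(raw) for r in raw]               # weights normalised to sum to 1
    c = [wi * ni / v for wi, ni, v in zip(w, n, s2)]
    total = sum(c)
    p = [ci / total for ci in c]
    centre = sum(pi * x for pi, x in zip(p, xbar))
    spread = sum(pi * (x - centre) ** 2 for pi, x in zip(p, xbar))
    delta = a / total - spread
    if delta < 0:
        return p, None                            # empty confidence set
    half = math.sqrt(delta)
    return p, (centre - half, centre + half)

p, interval = jk_interval([12, 15, 7, 16], [62.3, 60.3, 59.5, 61.5],
                          [12.986, 7.840, 33.433, 18.513], a=3.191)
print([round(pi, 4) for pi in p])
print(interval)
```

For the albumin data this recovers the tabulated $p_i$ and the interval to within rounding. The weighted spread is computed around the centre rather than as a difference of two large sums, which avoids cancellation error.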



Figure 3.1: Posterior pdf of $\mu$ (albumin data)

Example  2 This  example  appears  in  Eberhardt,  Reeve  and  Spiegelman  (1989)  and 
deals  with  the  problem  of  estimation  of  mean  Selenium  in  non-fat  milk  powder  by 
combining  the  results  of  four  methods.  The  data  appear  in  Table  3.5. 


Table 3.5: Selenium in non-fat milk powder

Methods                                  n_i     Mean      Variance
Atomic absorption spectrometry            8      105.00    85.711
Neutron activation
  1) Instrumental                        12      109.75    20.748
  2) Radiochemical                       14      109.50     2.729
Isotope dilution mass spectrometry        8      113.25    33.640

Here again we have applied all the methods described in Section 3.5, and have computed the two-sided confidence intervals for $\mu$ with $\alpha = 0.05$. Also, we have computed the equal two-sided intervals and HPD regions based on the second order probability matching prior to compare the performance at the same significance level. The marginal posterior pdf of the common mean $\mu$ corresponding to the second order probability matching prior appears in Figure 3.2. Table 3.6 gives the resulting equal two-sided confidence intervals and HPD regions. The intervals based on the Bayesian procedure are nearly equivalent to the best frequentist interval, which here is the one based on the inverse normal method, and are shorter than those of the other frequentist procedures. Also, unlike the previous example, the Fairweather interval does not have the shortest length here.


Table 3.6: Interval Estimates for $\mu$

Methods                  Critical Values                        Weights                                       Interval
C & S (3.19)             c = 3.1283                                                                           [108.12, 110.88]
T (3.20)                 c_i = 3.321, 2.970, 2.886, 3.321                                                     [108.23, 110.77]
F (3.23)                 b = 1.118                              u_i = 0.2309, 0.2645, 0.2736, 0.2309          [108.59, 110.81]
J & K (3.24)             a = 3.341                              p_i = 0.0068, 0.0777, 0.8908, 0.0247          [108.52, 110.68]
Fisher (3.26)                                                                                                 [108.59, 110.61]
Inverse Normal (3.27)                                                                                         [108.78, 110.47]
Logit (3.28)                                                                                                  [108.72, 110.51]
Prob. Matching Prior
  Equal Two-sided                                                                                             [108.69, 110.48]
  HPD region                                                                                                  [108.71, 110.49]

3.6  Concluding  Remarks 

The chapter presents yet another example where a unified Bayesian procedure can lead to good frequentist properties in situations where there is no clear-cut choice between frequentist procedures. Yu, Sun and Sinha (1999) have recommended the use of Fisher's method for finding confidence intervals for the common mean, based primarily on pragmatic considerations. In both the examples considered, the proposed Bayesian credible interval is shorter than the interval based on Fisher's method. Thus, the Bayesian procedure deserves serious consideration even for frequentist analysis.


CHAPTER  4 

BAYESIAN  INFERENCE  FOR  THE  RATIOS  OF  REGRESSION 
COEFFICIENTS  IN  A LINEAR  MODEL 


4.1  Introduction 


There is a large class of important statistical problems which can be categorized under the general heading of inference about the ratio of two regression coefficients in a general linear model. Included in this class are (i) the calibration problem, (ii) the Fieller-Creasy problem, (iii) parallel-line bioassay, and (iv) slope-ratio bioassay.

The objective of this chapter is to present a unified default Bayesian analysis for the general problem involving ratios of regression coefficients. To this end, once again an orthogonal transformation of the parameter vector is proposed. This orthogonal transformation becomes a convenient starting point in the development of a variety of default Bayesian priors.

The general regression model is given by

$$Y_i = \sum_{j=1}^{r} \beta_j x_{ij} + e_i, \qquad (i = 1, \ldots, n), \qquad (4.1)$$

where the parameter of interest is $\rho = \beta_1 / \beta_2$. In matrix notation, we write the model as

$$Y = X\beta + e, \qquad (4.2)$$

where

$$Y = (Y_1, \ldots, Y_n)^T, \quad X = \begin{pmatrix} x_{11} & \cdots & x_{1r} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nr} \end{pmatrix}, \quad \beta = (\beta_1, \ldots, \beta_r)^T, \quad \text{and} \quad e = (e_1, \ldots, e_n)^T.$$

We also assume that $e \sim N_n(0, \sigma^2 I_n)$.

In  Section  4.2,  we  introduce  some  special  examples  of  important  statistical  prob- 
lems which  all  arise  as  special  cases  of  the  general  problem  discussed  above.  This 
observation  is  not  new.  For  instance,  these  models  have  all  been  previously  recog- 
nized as  special  cases  of  the  general  model  by  Buonaccorsi  and  Gatsonis  (1988). 

The new feature of this chapter is the introduction of a nontrivial orthogonal transformation of the original parameter vector $(\beta, \sigma)$ in Section 4.3, which facilitates systematic development of default Bayesian priors, such as the different reference priors and probability matching priors. In this process, some of the priors proposed, for example, in Liseo (1993), Yin and Ghosh (1997), Hunter and Lamboy (1981), Ghosh, Carlin and Srivastava (1995), Mendoza (1996) and many others are found as special cases and, more importantly, some new priors are developed.

The  reference  priors  and  probability  matching  priors  are  developed  in  Section 
4.4.  Section  4.5  proves  the  propriety  of  posteriors  under  these  default  priors  with 
very  mild  assumptions  on  the  sample  sizes.  Two  real  examples  of  special  cases, 
parallel-line  bioassay  and  slope-ratio  bioassay,  are  discussed  in  Section  4.6.  Section 
4.7  provides  mathematical  proofs  of  theorems. 




4.2  Some  Special  Examples 
4.2.1  The  Linear  Calibration  Problem 

Statistical calibration problems arise in the presence of two measurement methods and involve making inferences about fixed but unknown explanatory variables corresponding to a response variable. Since the appearance of Krutchkoff's (1967) controversial paper, a large number of papers have appeared on the topic. Univariate Bayesian calibration has been discussed by Hoadley (1970) and Hunter and Lamboy (1981). Hoadley (1970) showed that Krutchkoff's (1967) inverse regression estimator can be interpreted in a Bayesian way. Hunter and Lamboy (1981), on the other hand, set out to provide Bayesian support for the classical estimator. The prior of Hunter and Lamboy (1981) was criticized both by Hill (1981) and Lawless (1981). In particular, Hill (1981) criticized the authors for introducing a prior “with no motivation and no attempt to understand either its implementations or its relationship with other prior distributions”. We will see, however, that the Hunter-Lamboy prior is indeed a second order probability matching prior.

Ghosh,  Carlin  and  Srivastava  (1995)  discussed  a Bayesian  approach  to  univariate 
calibration  with  the  reference  priors  and  the  first  order  probability  matching  priors. 
They  show  that  Jeffreys  noninformative  prior  as  well  as  the  one  used  by  Hunter  and 
Lamboy  (1981)  are  first  order  probability  matching  priors. 

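In a univariate linear calibration problem, the calibration experiment and the prediction experiment can be represented as

$$Z_{1i} = \alpha + \beta x_i + \epsilon_{1i} \quad (i = 1, \ldots, m),$$

$$Z_{2j} = \alpha + \beta\rho + \epsilon_{2j} \quad (j = 1, \ldots, k), \qquad (4.3)$$

respectively, where $\epsilon_{1i}$ and $\epsilon_{2j}$ are i.i.d. $N(0, \sigma^2)$. This is a special case of the model given in (4.1) with $n = m + k$, $r = 3$, $Y_i = Z_{1i}$ $(i = 1, \ldots, m)$ and $Y_{m+j} = Z_{2j}$ $(j = 1, \ldots, k)$. Also, $\beta_1 = \beta\rho$, $\beta_2 = \beta$, $\beta_3 = \alpha$ and the design matrix $X$ is given by

$$X = \begin{pmatrix} 0 & \cdots & 0 & 1 & \cdots & 1 \\ x_1 & \cdots & x_m & 0 & \cdots & 0 \\ 1 & \cdots & 1 & 1 & \cdots & 1 \end{pmatrix}^{T}. \qquad (4.4)$$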


4.2.2  The  Fieller-Creasy  Problem 

The celebrated Fieller-Creasy problem (Fieller, 1954; Creasy, 1954) involves
inference about the ratio of two normal means. This problem has posed a constant
challenge to frequentist and likelihood based inference. Fieller's method of providing
a confidence set for this ratio based on a pivot can lead to two disjoint unbounded
sets or even the whole real line. Also, as pointed out by Gleser and Hwang (1987)
(see also Berger, Liseo and Wolpert, 1996), based on any sample of arbitrary but
fixed size n, a confidence interval of finite expected length for this ratio has coverage
probability (taking the infimum over all points in the parameter space) of 0.

Bayesian analysis for this problem based on noninformative priors began with
Kappenman, Geisser and Antle (1970), and was addressed subsequently in Bernardo
(1977), Sendra (1982), Mendoza (1988, 1996), Stephens and Smith (1992), Liseo
(1993), Berger, Liseo and Wolpert (1996), and Philippe and Robert (1998).

Liseo (1993) compared profile likelihood and its modification with a Bayesian
analysis based on reference priors, when the variance is known. Mendoza (1996)
investigated frequentist coverage probabilities of HPD intervals for reference priors
through simulations with known variance. In general, integrated likelihood approaches with
noninformative priors are advocated by Berger, Liseo and Wolpert (1996) for tackling
a class of important but troublesome problems, including the Fieller-Creasy problem.




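The assumed model for the Fieller-Creasy problem is given by

$$
\begin{aligned}
Z_{1i} &= \beta + \epsilon_{1i} \quad (i = 1, \ldots, m), \\
Z_{2j} &= \beta\rho + \epsilon_{2j} \quad (j = 1, \ldots, k),
\end{aligned}
\tag{4.5}
$$

where the $\epsilon_{1i}$ and $\epsilon_{2j}$ are i.i.d. $N(0, \sigma^2)$. This is once again a special case of the
model given in (4.1) with $n = m + k$, $r = 2$, $Y_i = Z_{1i}$ $(i = 1, \ldots, m)$, $Y_{m+j} = Z_{2j}$
$(j = 1, \ldots, k)$, $\beta_1 = \beta\rho$ and $\beta_2 = \beta$. The design matrix $X$ is given by

$$
X^T = \begin{pmatrix}
0 & \cdots & 0 & 1 & \cdots & 1 \\
1 & \cdots & 1 & 0 & \cdots & 0
\end{pmatrix}.
\tag{4.6}
$$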


4.2.3  Parallel-Line  Bioassay  Problem 

Consider a test drug T being compared to a standard drug S. T and S are said
to be similar if there is a constant $\rho > 0$ such that the distribution of the response to
dose $w_2$ of the test drug is equivalent to that of dose $w_1 = \rho w_2$ of the standard drug.
When the drugs are similar, $\rho$ is called the relative potency of T to S.

In many dose-response experiments the quantitative response Z is linearly related
to the log dose, $x = \log w$, and we can assume the model

$$
E[Z \mid x] = \alpha + \beta x, \tag{4.7}
$$

where $\alpha$ and $\beta$ are unknown real-valued parameters. If S and T are similar, then the
responses $Z_1$ and $Z_2$ to S and T, respectively, are related by

$$
\begin{aligned}
E[Z_1 \mid x] &= \alpha + \beta x, \\
E[Z_2 \mid x] &= \alpha + \beta(x + \mu),
\end{aligned}
\tag{4.8}
$$

for some $\alpha$, $\beta$ and $\mu$, where $\mu = \log\rho$. This follows from the assumption that dose $w$
of T is equivalent to dose $\rho w$ of S. The estimation of $\rho$ (or equivalently $\mu$) based on
this model is called a parallel-line bioassay.

Given the intercepts $\alpha_1 = \alpha$ and $\alpha_2 = \alpha + \beta\log\rho$ and the common slope $\beta$,
we may observe that the horizontal distance between the two regression lines in (4.8) is
$\mu = \log\rho = (\alpha_2 - \alpha_1)/\beta$.

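Consider an experiment where each of p doses $(x_{11}, \ldots, x_{1p})$ of a standard drug S is
assayed m times and each of q doses $(x_{21}, \ldots, x_{2q})$ of a test drug T is assayed u times, so
that a set $\{Z_{1ik},\ i = 1, \ldots, p;\ k = 1, \ldots, m;\ Z_{2jk},\ j = 1, \ldots, q;\ k = 1, \ldots, u\}$ of
$pm + qu$ observations is obtained. The assumed model here is

$$
\begin{aligned}
Z_{1ik} &= \alpha + \beta x_{1i} + \epsilon_{1ik} \quad (k = 1, \ldots, m;\ i = 1, \ldots, p), \\
Z_{2jk} &= \alpha + \beta(x_{2j} + \mu) + \epsilon_{2jk} \quad (k = 1, \ldots, u;\ j = 1, \ldots, q),
\end{aligned}
\tag{4.9}
$$

where the $\epsilon_{1ik}$ and $\epsilon_{2jk}$ are i.i.d. $N(0, \sigma^2)$. Once again, $\rho$ (equivalently $\mu = \log\rho$) is
the parameter of interest.

In order to recognize this model as a special case of (4.1), we first represent
the $Z_{1ik}$'s and $Z_{2jk}$'s as the Y vector as in the previous example. Also, as before,
$n = pm + qu$, $r = 3$, $\beta_1 = \beta\mu$, $\beta_2 = \beta$ and $\beta_3 = \alpha$. The design matrix $X$ in this case
is given by

$$
X^T = \begin{pmatrix}
0 & \cdots & 0 & \cdots & 0 & \cdots & 0 & 1 & \cdots & 1 & \cdots & 1 & \cdots & 1 \\
x_{11} & \cdots & x_{11} & \cdots & x_{1p} & \cdots & x_{1p} & x_{21} & \cdots & x_{21} & \cdots & x_{2q} & \cdots & x_{2q} \\
1 & \cdots & 1 & \cdots & 1 & \cdots & 1 & 1 & \cdots & 1 & \cdots & 1 & \cdots & 1
\end{pmatrix}.
\tag{4.10}
$$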


4.2.4  Slope-Ratio  Bioassay  Problem 

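Estimation of the slope ratio is a common problem in biological assays. Consider
an experiment where each of p doses $(x_{11}, \ldots, x_{1p})$ of a standard drug S is assayed m times
and each of q doses $(x_{21}, \ldots, x_{2q})$ of a test drug T is assayed u times, so that a set
$\{Z_{1ik},\ i = 1, \ldots, p;\ k = 1, \ldots, m;\ Z_{2jk},\ j = 1, \ldots, q;\ k = 1, \ldots, u\}$ of $pm + qu$
observations is obtained.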

The classical approach to the statistical analysis of these ratio-type problems shares
the drawbacks discussed in Fieller (1954) and Creasy (1954). On the other hand,
a likelihood based approach also encounters difficulties in producing interval
estimators. Mendoza (1990) discussed Bayesian analysis for the slope-ratio problem. Under
the normality assumption, he derived the two-group reference prior and compared the
corresponding credible intervals with the confidence intervals via Fieller's theorem.

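The assumed model for slope-ratio bioassay is

$$
\begin{aligned}
Z_{1ik} &= \alpha + \beta x_{1i} + \epsilon_{1ik} \quad (k = 1, \ldots, m;\ i = 1, \ldots, p), \\
Z_{2jk} &= \alpha + \beta\rho x_{2j} + \epsilon_{2jk} \quad (k = 1, \ldots, u;\ j = 1, \ldots, q),
\end{aligned}
\tag{4.11}
$$

where the $\epsilon_{1ik}$ and $\epsilon_{2jk}$ are i.i.d. $N(0, \sigma^2)$. To see how this problem arises as a special case
of the general regression model given in (4.1), let

$$
\begin{aligned}
Y_1 &= Z_{111}, & Y_2 &= Z_{112}, & \ldots, &\quad Y_m = Z_{11m}, \\
Y_{m+1} &= Z_{121}, & Y_{m+2} &= Z_{122}, & \ldots, &\quad Y_{2m} = Z_{12m}, \\
&\ \vdots \\
Y_{(p-1)m+1} &= Z_{1p1}, & Y_{(p-1)m+2} &= Z_{1p2}, & \ldots, &\quad Y_{pm} = Z_{1pm}, \\
Y_{pm+1} &= Z_{211}, & Y_{pm+2} &= Z_{212}, & \ldots, &\quad Y_{pm+u} = Z_{21u}, \\
&\ \vdots \\
Y_{pm+(q-1)u+1} &= Z_{2q1}, & Y_{pm+(q-1)u+2} &= Z_{2q2}, & \ldots, &\quad Y_{pm+qu} = Z_{2qu}.
\end{aligned}
$$

We can then identify this model as a special case of the general regression model with
$n = pm + qu$, $r = 3$ and $\beta_1 = \beta\rho$, $\beta_2 = \beta$, $\beta_3 = \alpha$. Also, the design matrix $X$ is
given by

$$
X^T = \begin{pmatrix}
0 & \cdots & 0 & \cdots & 0 & \cdots & 0 & x_{21} & \cdots & x_{21} & \cdots & x_{2q} & \cdots & x_{2q} \\
x_{11} & \cdots & x_{11} & \cdots & x_{1p} & \cdots & x_{1p} & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
1 & \cdots & 1 & \cdots & 1 & \cdots & 1 & 1 & \cdots & 1 & \cdots & 1 & \cdots & 1
\end{pmatrix}.
\tag{4.12}
$$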


4.3  The  Orthogonal  Transformation 

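This section introduces a transformation of the parameter vector $(\beta, \sigma)$ which
results in the orthogonality (Cox and Reid, 1987) of the ratio $\theta_1 = \beta_1/\beta_2$ with the
remaining parameters. We begin with the Fisher information matrix

$$
I(\beta, \sigma) = \frac{1}{\sigma^2}
\begin{pmatrix} ((s_{jl})) & 0 \\ 0 & 2n \end{pmatrix},
$$

where $s_{jl} = \sum_{i=1}^{n} x_{ij}x_{il}$ $(j, l = 1, \ldots, r)$. Consider the transformation

$$
\begin{aligned}
\beta_1 &= \theta_1\theta_2 h(\theta_1), \\
\beta_2 &= \theta_2 h(\theta_1), \\
\beta_j &= \theta_j - \theta_2 g_j(\theta_1), \quad j = 3, \ldots, r, \\
\sigma &= \theta_{r+1}.
\end{aligned}
$$

Then the Jacobian matrix is given by

$$
J = \begin{pmatrix}
\theta_2\{\theta_1 h'(\theta_1) + h(\theta_1)\} & \theta_2 h'(\theta_1) & -\theta_2 g_3'(\theta_1) & \cdots & -\theta_2 g_r'(\theta_1) & 0 \\
\theta_1 h(\theta_1) & h(\theta_1) & -g_3(\theta_1) & \cdots & -g_r(\theta_1) & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}.
$$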

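Theorem 4.1 Let

$$
g_j(\theta_1) = h(\theta_1)(a_{j1}\theta_1 + a_{j2}), \quad j = 3, \ldots, r,
$$

where

$$
\begin{pmatrix} a_{31} & a_{32} \\ \vdots & \vdots \\ a_{r1} & a_{r2} \end{pmatrix}
=
\begin{pmatrix} s_{33} & \cdots & s_{3r} \\ \vdots & & \vdots \\ s_{r3} & \cdots & s_{rr} \end{pmatrix}^{-1}
\begin{pmatrix} s_{13} & s_{23} \\ \vdots & \vdots \\ s_{1r} & s_{2r} \end{pmatrix}.
$$

Then parametric orthogonality holds by choosing

$$
h(\theta_1) = \left[c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22}\right]^{-1/2}, \tag{4.13}
$$

where

$$
c_{11} = s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}, \qquad
c_{12} = s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}, \qquad
c_{22} = s_{22} - \sum_{j=3}^{r} s_{2j}a_{j2}.
$$

The proof of this theorem is deferred to the appendix.

The resulting Fisher information matrix is then given by

$$
I(\theta) = \frac{1}{\theta_{r+1}^2}
\begin{pmatrix}
\dfrac{\theta_2^2(c_{11}c_{22} - c_{12}^2)}{(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^2} & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & ((s_{jl}))_{j,l = 3, \ldots, r} & 0 \\
0 & 0 & 0 & 2n
\end{pmatrix}.
\tag{4.14}
$$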
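The orthogonality claimed in Theorem 4.1 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Jacobian and the functions $h$ and $g_j$ exactly as written above, uses NumPy, and the function name `orthogonal_information` and the random design are hypothetical choices made only for illustration.

```python
import numpy as np

def orthogonal_information(X, theta1, theta2, sigma=1.0):
    """Return (I(theta), C, Q) with I(theta) = J I(beta, sigma) J^T (illustrative helper)."""
    n, r = X.shape
    S = X.T @ X                                   # s_jl = sum_i x_ij x_il
    A = np.linalg.solve(S[2:, 2:], S[2:, :2])     # rows (a_j1, a_j2), j = 3..r
    C = S[:2, :2] - S[:2, 2:] @ A                 # c11, c12, c22 of Theorem 4.1
    c11, c12, c22 = C[0, 0], C[0, 1], C[1, 1]
    Q = c11 * theta1**2 + 2 * c12 * theta1 + c22
    h = Q ** -0.5
    dh = -(c11 * theta1 + c12) * Q ** -1.5        # h'(theta_1)
    lin = A[:, 0] * theta1 + A[:, 1]              # a_j1 theta_1 + a_j2
    g = h * lin
    dg = dh * lin + h * A[:, 0]                   # g_j'(theta_1)
    J = np.eye(r + 1)                             # rows 3..r+1 are unit vectors
    J[0, :2] = [theta2 * (theta1 * dh + h), theta2 * dh]
    J[0, 2:r] = -theta2 * dg
    J[1, :2] = [theta1 * h, h]
    J[1, 2:r] = -g
    I_beta = np.zeros((r + 1, r + 1))
    I_beta[:r, :r] = S
    I_beta[r, r] = 2 * n
    I_beta /= sigma**2
    return J @ I_beta @ J.T, C, Q

# Hypothetical design: n = 12 observations, r = 4 regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
info, C, Q = orthogonal_information(X, theta1=0.7, theta2=1.3, sigma=2.0)
print(np.round(info, 6))
```

In this sketch the first row and column of the printed matrix vanish off the diagonal, and the (2,2) entry equals $1/\sigma^2$, which is the block structure displayed in the resulting Fisher information matrix.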




4.4  Development  of  Noninformative  Priors 


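From (4.2) the likelihood function is given by

$$
L(\theta) \propto \theta_{r+1}^{-n}\exp\left[-\frac{1}{2\theta_{r+1}^2}\sum_{i=1}^{n}\Big\{y_i - \theta_2 h(\theta_1)\Big(\theta_1 x_{i1} + x_{i2} - \sum_{j=3}^{r} x_{ij}(a_{j1}\theta_1 + a_{j2})\Big) - \sum_{j=3}^{r}\theta_j x_{ij}\Big\}^2\right].
\tag{4.15}
$$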


We  now  consider  several  noninformative  priors  similar  to  the  ones  given  in  the  pre- 
vious chapters. 

4.4.1  Jeffreys’  Prior 

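Jeffreys' prior is proportional to the positive square root of the determinant of
$I(\theta)$. Thus, from (4.14),

$$
\pi^{J}(\theta) \propto \theta_{r+1}^{-(r+1)}\,|\theta_2|\,(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}.
$$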


4.4.2  Reference  Priors 

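First we consider the two-group reference prior. The parameter space is
partitioned into $\{\theta_1\}, \{\theta_2, \ldots, \theta_{r+1}\}$, $\theta_1$ being the parameter of interest. Let
$\theta_{(1)} = \theta_1$ and $\theta_{(2)} = (\theta_2, \ldots, \theta_{r+1})$. Then the Fisher information matrix can be
expressed as

$$
I(\theta) = \text{Block Diagonal}\big(h_1(\theta), h_2(\theta)\big), \tag{4.16}
$$

where

$$
h_1(\theta) = \frac{\theta_2^2(c_{11}c_{22} - c_{12}^2)}{\theta_{r+1}^2(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^2}
$$

and

$$
h_2(\theta) = \theta_{r+1}^{-2}\,\text{Block Diagonal}\big(1, ((s_{jl})), 2n\big), \quad j, l = 3, \ldots, r.
$$

Hence, from Theorem 1 of Datta and M. Ghosh (1995), the two-group reference prior
is given by

$$
\pi^{2R}(\theta) \propto \theta_{r+1}^{-r}(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}. \tag{4.17}
$$

Next we consider the three-group reference prior. The parameter space is
partitioned into $\{\theta_1\}, \{\theta_2\}, \{\theta_3, \ldots, \theta_{r+1}\}$, $\theta_1$ being still the parameter of interest. Let
$\theta_{(1)} = \theta_1$, $\theta_{(2)} = \theta_2$ and $\theta_{(3)} = (\theta_3, \ldots, \theta_{r+1})$. Then the Fisher information matrix
can be rewritten as

$$
I(\theta) = \text{Block Diagonal}\big(h_1(\theta), h_2(\theta), h_3(\theta)\big),
$$

where

$$
h_1(\theta) = \frac{\theta_2^2(c_{11}c_{22} - c_{12}^2)}{\theta_{r+1}^2(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^2}, \qquad
h_2(\theta) = \theta_{r+1}^{-2}, \qquad
h_3(\theta) = \theta_{r+1}^{-2}\,\text{Block Diagonal}\big(((s_{jl})), 2n\big), \quad j, l = 3, \ldots, r.
$$

From Theorem 1 of Datta and M. Ghosh (1995) and following similar steps to the
previous case, the three-group reference prior is given by

$$
\pi^{3R}(\theta) \propto \theta_{r+1}^{-(r-1)}(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}. \tag{4.18}
$$

A second three-group reference prior with the partition $\{\theta_1\}, \{\theta_2, \ldots, \theta_r\}, \{\theta_{r+1}\}$ leads
to the reference prior $\theta_{r+1}^{-1}(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}$.

Finally, the one-at-a-time reference prior with the partition $\{\theta_1\}, \{\theta_2\}, \ldots, \{\theta_r\}, \{\theta_{r+1}\}$,
where $\theta_1$ is the parameter of interest while the remaining parameters are considered
in any arbitrary order, is given by the same reference prior,
$\pi^{1R}(\theta) \propto \theta_{r+1}^{-1}(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}$.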


4.4.3  Probability  Matching  Priors 

Due to the orthogonality of $\theta_1$ with $(\theta_2, \ldots, \theta_{r+1})$ (cf. Cox and Reid, 1987), from
Tibshirani (1989), the class of first order probability matching priors is characterized
by

$$
\pi(\theta) \propto \theta_{r+1}^{-1}\,|\theta_2|\,(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}\,g(\theta_2, \ldots, \theta_{r+1}),
$$

where $g$ is any arbitrary positive-valued function differentiable in its arguments.
Jeffreys' prior and all of the reference priors derived above are first order probability
matching priors. Indeed, there are infinitely many first order probability matching
priors. To narrow down the selection of priors within this class, we now consider
second order probability matching priors.

From Mukerjee and Ghosh (1997), due to the orthogonality of $\theta_1$ with $(\theta_2, \ldots, \theta_{r+1})$,
the class of second order probability matching priors is characterized as the solutions
of the partial differential equation


(4.19) 


where  = 0.  Also, 


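$$
\frac{\partial\log L}{\partial\theta_1}
= \frac{1}{\theta_{r+1}^2}\sum_{i=1}^{n}\Big[y_i - \theta_2 h(\theta_1)\Big\{\theta_1 x_{i1} + x_{i2} - \sum_{j=3}^{r} x_{ij}(a_{j1}\theta_1 + a_{j2})\Big\} - \sum_{j=3}^{r}\theta_j x_{ij}\Big]
\times \theta_2\Big[h'(\theta_1)\Big\{\theta_1 x_{i1} + x_{i2} - \sum_{j=3}^{r} x_{ij}(a_{j1}\theta_1 + a_{j2})\Big\} + h(\theta_1)\Big(x_{i1} - \sum_{j=3}^{r} x_{ij}a_{j1}\Big)\Big].
$$

For $k = 3, \ldots, r$,

$$
\begin{aligned}
\frac{\partial^2\log L}{\partial\theta_1\partial\theta_k}
&= -\frac{\theta_2}{\theta_{r+1}^2}\Big[h'(\theta_1)\Big\{\theta_1\Big(s_{k1} - \sum_{j=3}^{r} s_{kj}a_{j1}\Big) + s_{k2} - \sum_{j=3}^{r} s_{kj}a_{j2}\Big\} + h(\theta_1)\Big\{s_{k1} - \sum_{j=3}^{r} s_{kj}a_{j1}\Big\}\Big] \\
&= -\frac{\theta_2}{\theta_{r+1}^2}\Big[h'(\theta_1)\{\theta_1\cdot 0 + 0\} + h(\theta_1)\cdot 0\Big] \\
&= 0
\end{aligned}
$$

by the definition of $a_{j1}$ and $a_{j2}$, which gives $\sum_{j=3}^{r} s_{kj}a_{j1} = s_{k1}$ and
$\sum_{j=3}^{r} s_{kj}a_{j2} = s_{k2}$ for $k = 3, \ldots, r$. Hence

$$
\frac{\partial^3\log L}{\partial\theta_1^2\,\partial\theta_k} = 0.
$$

Thus,

$$
L_{11k} = E\Big[\frac{\partial^3\log L}{\partial\theta_1^2\,\partial\theta_k}\Big] = 0, \quad k = 3, \ldots, r.
$$

Also, since

$$
\frac{\partial^2\log L}{\partial\theta_1\partial\theta_{r+1}} = -\frac{2}{\theta_{r+1}}\frac{\partial\log L}{\partial\theta_1}
$$

and

$$
\frac{\partial^3\log L}{\partial\theta_1^2\,\partial\theta_{r+1}} = -\frac{2}{\theta_{r+1}}\frac{\partial^2\log L}{\partial\theta_1^2},
$$

one has

$$
L_{11,r+1} = E\Big[\frac{\partial^3\log L}{\partial\theta_1^2\,\partial\theta_{r+1}}\Big] = \frac{2}{\theta_{r+1}}I_{11}.
$$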

Next 

d'^  log  L 
5^1 902 

1 " r 

Y.\[yi~  ^2h{9i){9iXix  + Xi2~Yl  ^ij{aji9i  + 0^2)} 

^r+1  i=i  '•  j=3 

x[/i'(6>i){6>ij;ji  + Xi2 


r r 

j=3  j=3 


f X { [^1^*1  + ^*2 

^r+1  i=l 


X)a;ij(aji6>i  + aj2)] 
j=3 




x[/i'(0i){0iXii  + Xi2  — '^Xij{aj\9i  + cij^)}  + h{6i){xii  — y~l  Xj^aji 

j=3  j=3 


After  some  algebra,  we  get 


11 


Lu2  = ^ 


Hence,  (4.20)  simplifies  to 


r+l  Q 9 


v=2 


7r+l 


= 0 (4.21) 


This  simplifies  to 

^[^r+1^9(^2,  ■■■,9r+l) 

du2  *■  02 

or  equivalently 


+ 


d M 


d0r+\ 2n 


q{92i  .■.,9r+i)  —0  (4.22) 


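Clearly $g(\theta_2, \ldots, \theta_{r+1}) \propto 1$ provides a solution to (4.22). Thus, a second order
probability matching prior is given by

$$
\pi^{M}(\theta) \propto \frac{|\theta_2|}{\theta_{r+1}}\,(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1}. \tag{4.23}
$$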


4.5  Propriety  of  Posteriors 

In  this  section,  we  prove  the  propriety  of  posteriors  under  the  priors  developed  in 
the  previous  section. 


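Theorem 4.2 Under the class of priors $\pi(\theta) \propto \theta_{r+1}^{-a}Q^{-1}(\theta_1)$, the marginal posterior
distribution of $\theta_1$ is proper when $n + a > r + 1$, and is given by

$$
\pi(\theta_1 \mid y) \propto Q^{-1}(\theta_1)\left[SSE + \frac{|C|(\hat\beta_2\theta_1 - \hat\beta_1)^2}{Q(\theta_1)}\right]^{-\frac{n+a-r}{2}},
$$

where

$$
Q(\theta_1) = c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22},
$$

$$
SSE = \sum_{i=1}^{n}(y_i - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \cdots - \hat\beta_r x_{ir})^2,
$$

and

$$
C = \begin{pmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{pmatrix}.
$$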

The  proof  of  this  theorem  is  deferred  to  the  appendix. 
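As an illustration of how a posterior of this form can be turned into interval estimates, the sketch below evaluates the kernel $Q^{-1}(\theta_1)[SSE + |C|(\hat\beta_2\theta_1 - \hat\beta_1)^2/Q(\theta_1)]^{-(n+a-r)/2}$ on a grid and reads off quantiles from the normalized cumulative sum. It is only a sketch under stated assumptions: the kernel (including its exponent) is taken as given, the function name and the grid limits are hypothetical, and the inputs are the summary statistics reported for the slope-ratio example of Section 4.6 with $a = 1$.

```python
import numpy as np

def theta1_posterior_quantiles(b1, b2, C, sse, n, r, a, probs, grid):
    """Grid approximation to posterior quantiles of theta_1 (illustrative only)."""
    c11, c12, c22 = C[0][0], C[0][1], C[1][1]
    det_c = c11 * c22 - c12**2
    q = c11 * grid**2 + 2 * c12 * grid + c22            # Q(theta_1) on the grid
    log_kernel = -np.log(q) - 0.5 * (n + a - r) * np.log(
        sse + det_c * (b2 * grid - b1) ** 2 / q
    )
    dens = np.exp(log_kernel - log_kernel.max())         # unnormalized density
    cdf = np.cumsum(dens)
    cdf /= cdf[-1]                                       # normalize on the grid
    return np.interp(probs, cdf, grid)

# Assumed truncation of the parameter range; the kernel is negligible outside it here.
grid = np.linspace(0.5, 0.9, 40001)
lo, hi = theta1_posterior_quantiles(
    81.2288, 118.6285, [[3.2, -1.8], [-1.8, 3.2]], 252.8857,
    n=20, r=3, a=1, probs=[0.025, 0.975], grid=grid,
)
print(lo, hi, hi - lo)
```

A grid approach is used because the kernel has no standard closed-form distribution function; a finer grid or an adaptive quadrature routine could replace the cumulative sum.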


Remark  4.1  This  class  of  priors  includes  the  two-group,  three-group  and  one-at-a- 
time  reference  priors  with  a = r,  r — 1 and  1 , respectively. 


Theorem 4.3 Under the class of priors $\pi(\theta) \propto \theta_{r+1}^{-a}\,|\theta_2|\,Q^{-1}(\theta_1)$, the marginal posterior
distribution of $\theta_1$ is proper when $n + a > r + 3$, and is given by


J—lssE  + + Q{e,)u\ei 

QhOin 


n+q  — r-|-3 


-k 


2a)  (^i)  I 


SSE 


Q{0i) 

\c\02e,-Pif\-^  dz 


Q(0 


r 

1)  J Jo 


. n+g-rfl  ) 

(1  + 2^)  2 


o;(6>i)  = 


+ C12S2)  + C\2^\  + C-22^2 

QiGi) 


where 




A 


SSB  + 


Q(0i) 


The  proof  of  this  theorem  is  also  deferred  to  the  appendix. 


Remark  4.2  In  Theorem  4.3  , a = r + 1 and  1 correspond  to  the  Jeffreys’  prior 
and  second  order  probability  matching  prior,  respectively. 

Remark 4.3 The prior used by Hunter and Lamboy (1981) for the linear calibration
problem, $\pi \propto |\beta|\sigma^{-1}$, is a second order probability matching prior. The general
class of first order probability matching priors for the linear calibration problem was also
derived in Ghosh, Carlin and Srivastava (1995).
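The identification in Remark 4.3 can be made explicit with a short change of variables. The following worked step is a sketch that assumes the second order matching prior has the form $\theta_{r+1}^{-1}|\theta_2|Q^{-1}(\theta_1)$ (the case $a = 1$ of Theorem 4.3) and uses the relations $\theta_2 = \beta_2/h(\theta_1)$ and $h(\theta_1) = Q^{-1/2}(\theta_1)$ from Section 4.3.

```latex
% Map (\theta_1, \theta_2, \ldots, \theta_{r+1}) to (\theta_1, \beta_2, \ldots, \beta_r, \sigma):
%   \theta_2 = \beta_2 Q^{1/2}(\theta_1), \quad
%   \theta_j = \beta_j + \beta_2 (a_{j1}\theta_1 + a_{j2}), \quad \theta_{r+1} = \sigma,
% so the Jacobian of the transformation is Q^{1/2}(\theta_1).
\begin{aligned}
\pi(\theta_1, \beta_2, \ldots, \beta_r, \sigma)
  &\propto \sigma^{-1}\,\bigl|\beta_2 Q^{1/2}(\theta_1)\bigr|\,Q^{-1}(\theta_1)\,Q^{1/2}(\theta_1) \\
  &= |\beta_2|\,\sigma^{-1}.
\end{aligned}
% In the calibration model (4.3), \beta_2 = \beta and \theta_1 = \rho,
% which gives the form |\beta| \sigma^{-1}.
```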

Remark 4.4 Ghosh and Yin (2000) derived probability matching priors and reference
priors for the generalized Fieller-Creasy problem, which involves inference about
the ratio of two location parameters for two independent symmetric location-scale
distributions.

Remark  4.5  The  two-group  reference  prior  was  derived  by  Mendoza  (1990)  for  slope- 
ratio  bioassay  problem  in  the  normal  case. 

Remark  4.6  Kim,  Carter  and  Hubert  (1991)  derived  a closed-form  point  and  interval 
estimator  in  a symmetric  parallel-line  bioassay. 

Remark 4.7 Buonaccorsi and Gatsonis (1988) provided a Bayesian solution to the
problem of inference about the ratio of two coefficients in a linear model. However, they
did not provide a way of choosing the function $h(\cdot)$.


Now, the resulting posteriors for these special cases are characterized by finding
the elements $c_{11}$, $c_{12}$ and $c_{22}$ of $C$ and applying Theorems 4.2 and 4.3. Table 4.1 gives
all the elements of $C$ obtained from the design matrix $X$ for the special cases.




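Table 4.1: Catalog of the elements of C

Cases                           Model     $c_{11}$, $c_{12}$ and $c_{22}$

Linear Calibration Problem      (4.3)     $c_{11} = \frac{mk}{m+k}$
                                          $c_{12} = -\frac{k}{m+k}\sum_{i=1}^{m} x_i$
                                          $c_{22} = \sum_{i=1}^{m} x_i^2 - \frac{1}{m+k}\big[\sum_{i=1}^{m} x_i\big]^2$

Fieller-Creasy Problem          (4.5)     $c_{11} = k$
                                          $c_{12} = 0$
                                          $c_{22} = m$

Slope-Ratio Bioassay Problem    (4.11)    $c_{11} = u\sum_{j=1}^{q} x_{2j}^2 - \frac{1}{pm+qu}\big[u\sum_{j=1}^{q} x_{2j}\big]^2$
                                          $c_{12} = -\frac{1}{pm+qu}\big[m\sum_{i=1}^{p} x_{1i}\big]\big[u\sum_{j=1}^{q} x_{2j}\big]$
                                          $c_{22} = m\sum_{i=1}^{p} x_{1i}^2 - \frac{1}{pm+qu}\big[m\sum_{i=1}^{p} x_{1i}\big]^2$

Parallel-Line Bioassay Problem  (4.9)     $c_{11} = \frac{pmqu}{pm+qu}$
                                          $c_{12} = u\sum_{j=1}^{q} x_{2j} - \frac{qu}{pm+qu}\big[m\sum_{i=1}^{p} x_{1i} + u\sum_{j=1}^{q} x_{2j}\big]$
                                          $c_{22} = m\sum_{i=1}^{p} x_{1i}^2 + u\sum_{j=1}^{q} x_{2j}^2 - \frac{1}{pm+qu}\big[m\sum_{i=1}^{p} x_{1i} + u\sum_{j=1}^{q} x_{2j}\big]^2$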
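The entries of C can also be obtained directly from a design matrix, which gives a quick check on the closed forms. The sketch below assumes the definition of Theorem 4.1, $C = S_{11} - S_{12}S_{22}^{-1}S_{21}$ with $S = X^TX$ partitioned after the first two columns; the helper names are hypothetical, and the designs use the column order $(\beta_1, \beta_2, \beta_3)$ of Section 4.2.

```python
import numpy as np

def c_matrix(X):
    """C = S11 - S12 S22^{-1} S21 for S = X^T X, split after the first two columns."""
    X = np.asarray(X, dtype=float)
    S = X.T @ X
    if X.shape[1] == 2:                 # Fieller-Creasy: no nuisance regression columns
        return S
    A = np.linalg.solve(S[2:, 2:], S[2:, :2])
    return S[:2, :2] - S[:2, 2:] @ A

def calibration_design(x, k):
    x = np.asarray(x, dtype=float)
    m = len(x)
    return np.column_stack([np.r_[np.zeros(m), np.ones(k)],
                            np.r_[x, np.zeros(k)],
                            np.ones(m + k)])

def fieller_creasy_design(m, k):
    return np.column_stack([np.r_[np.zeros(m), np.ones(k)],
                            np.r_[np.ones(m), np.zeros(k)]])

def parallel_line_design(x1, m, x2, u):
    xs, xt = np.repeat(x1, m), np.repeat(x2, u)   # each dose replicated m or u times
    return np.column_stack([np.r_[np.zeros(len(xs)), np.ones(len(xt))],
                            np.r_[xs, xt],
                            np.ones(len(xs) + len(xt))])

def slope_ratio_design(x1, m, x2, u):
    xs, xt = np.repeat(x1, m), np.repeat(x2, u)
    return np.column_stack([np.r_[np.zeros(len(xs)), xt],
                            np.r_[xs, np.zeros(len(xt))],
                            np.ones(len(xs) + len(xt))])

# Designs of the two numerical examples in Section 4.6.
C_parallel = c_matrix(parallel_line_design([-1.0, 0.0, 1.0], 6, [-1.0, 0.0, 1.0], 6))
C_slope = c_matrix(slope_ratio_design([0.0, 0.5, 1.0], 4, [0.5, 1.0], 4))
print(C_parallel)
print(C_slope)
```

The two printed matrices can be compared with the values of C quoted in the numerical examples of Section 4.6.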



4.6  Numerical  Examples 

In order to see the performance of the noninformative priors derived in Section
4.4, we analyze two sets of data using Jeffreys' prior, the reference priors and the second
order probability matching prior.

Example 1. Parallel-Line Bioassay Problem

Finney (1978, p. 105) analyzed the data in Table 4.2 from turbidimetric measurements
on the growth response of Lactobacillus leichmannii to vitamin $B_{12}$, reported
by Emery et al. (1951).


Table 4.2: Responses in an assay of vitamin B12

Stimulus        Standard                  Test
Dose        -1.0    0.0    1.0       -1.0    0.0    1.0

            0.96   1.06   1.17       0.91   1.09   1.15
            0.91   1.07   1.14       0.93   1.04   1.15
            0.92   0.99   1.14       0.98   0.97   1.14
            0.76   0.86   1.13       0.96   1.06   1.16
            1.03   1.06   1.13       0.89   1.04   1.10
            0.93   1.02   1.15       1.01   1.02   1.15

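In this case, $n = 36$, $r = 3$, $p = 3$, $q = 3$, $SSE = 0.0961$ and

$$
\hat\beta = \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \\ \hat\beta_3 \end{pmatrix}
= \begin{pmatrix} \widehat{\beta\mu} \\ \hat\beta \\ \hat\alpha \end{pmatrix}
= \begin{pmatrix} 0.0178 \\ 0.105 \\ 1.0238 \end{pmatrix},
\qquad
\begin{pmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{pmatrix}
= \begin{pmatrix} 9 & 0 \\ 0 & 24 \end{pmatrix}.
$$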

Finney obtained Fieller's 95% confidence interval for these data as (-0.193, 0.551),
with length 0.744. Table 4.3 shows that the noninformative priors
performed well, giving shorter intervals than Finney's.
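The summary statistics quoted for Example 1 can be reproduced by ordinary least squares. The sketch below is illustrative only: it assumes the layout of Table 4.2 in which each row is one replicate and the columns are the doses -1, 0, 1 for the standard and then the test preparation, and it codes the parallel-line model (4.9) with columns ordered as $(\beta_1, \beta_2, \beta_3)$.

```python
import numpy as np

doses = [-1.0, 0.0, 1.0]
standard = np.array([[0.96, 1.06, 1.17], [0.91, 1.07, 1.14], [0.92, 0.99, 1.14],
                     [0.76, 0.86, 1.13], [1.03, 1.06, 1.13], [0.93, 1.02, 1.15]])
test = np.array([[0.91, 1.09, 1.15], [0.93, 1.04, 1.15], [0.98, 0.97, 1.14],
                 [0.96, 1.06, 1.16], [0.89, 1.04, 1.10], [1.01, 1.02, 1.15]])

# Row-major flattening pairs each response with its dose.
x_std = np.tile(doses, standard.shape[0])
x_test = np.tile(doses, test.shape[0])
z = np.r_[standard.ravel(), test.ravel()]

# Columns: indicator of the test preparation, log dose, intercept.
X = np.column_stack([np.r_[np.zeros(x_std.size), np.ones(x_test.size)],
                     np.r_[x_std, x_test],
                     np.ones(z.size)])

beta_hat, *_ = np.linalg.lstsq(X, z, rcond=None)
resid = z - X @ beta_hat
sse = float(resid @ resid)

S = X.T @ X
C = S[:2, :2] - S[:2, 2:] @ np.linalg.solve(S[2:, 2:], S[2:, :2])

print(np.round(beta_hat, 4), round(sse, 4))
print(C)
print(beta_hat[0] / beta_hat[1])   # point estimate of theta_1 = beta_1 / beta_2
```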




Table 4.3: Posterior quantiles

Prior          $P_{0.025}$    $P_{0.975}$    Length

$\pi^{J}$        -0.165         0.523        0.688
$\pi^{1R}$       -0.187         0.536        0.723
$\pi^{2R}$       -0.176         0.523        0.699
$\pi^{3R}$       -0.181         0.529        0.710
$\pi^{M}$        -0.182         0.541        0.723

Example  2.  Slope-Ratio  Bioassay  Problem 

The data in Table 4.4 were reported by Wood (1946), who used 20 tubes, 4 each for
blanks, 0.1, 0.2 μg of standard riboflavin and 0.025, 0.5 g malt per tube. Responses
were measured to the nearest 0.05 ml; for arithmetic convenience, this may be taken
as the unit of response, so that all values of y are integers. The data were analyzed by Finney
(1978, p. 161) using the classical approach.


Table 4.4: Responses in an assay of riboflavin in malt

Stimulus       Standard              Test
Dose        0.0    0.5    1.0      0.5    1.0

             38     97    167       80    121
             45    100    164       88    124
             40    105    159       90    122
             44     98    156       82    122

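In this case, $n = 20$, $r = 3$, $p = 3$, $q = 2$, $SSE = 252.8857$ and

$$
\hat\beta = \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \\ \hat\beta_3 \end{pmatrix}
= \begin{pmatrix} \widehat{\beta\rho} \\ \hat\beta \\ \hat\alpha \end{pmatrix}
= \begin{pmatrix} 81.2288 \\ 118.6285 \\ 42.1429 \end{pmatrix},
\qquad
\begin{pmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{pmatrix}
= \begin{pmatrix} 3.2 & -1.8 \\ -1.8 & 3.2 \end{pmatrix}.
$$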



For this dataset, the 95% confidence interval obtained by Finney was (0.6464, 0.7235),
with length 0.0771. On the other hand, from Table 4.5 we can see
that the intervals between the 0.025 and 0.975 posterior quantiles corresponding to Jeffreys'
prior ($\pi^{J}$), the one-at-a-time reference prior ($\pi^{1R}$), the two-group reference prior
($\pi^{2R}$), the three-group reference prior ($\pi^{3R}$) and the second order probability
matching prior ($\pi^{M}$) are similar to or shorter than Finney's result.
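The least squares quantities used in Example 2 can be reproduced in the same way. This is again a sketch rather than the original computation: it assumes that each row of Table 4.4 is one replicate, with columns for standard doses 0.0, 0.5, 1.0 followed by test doses 0.5, 1.0, and it codes the slope-ratio model (4.11) with columns ordered as $(\beta_1, \beta_2, \beta_3)$.

```python
import numpy as np

std_doses, test_doses = [0.0, 0.5, 1.0], [0.5, 1.0]
standard = np.array([[38, 97, 167], [45, 100, 164], [40, 105, 159], [44, 98, 156]], dtype=float)
test = np.array([[80, 121], [88, 124], [90, 122], [82, 122]], dtype=float)

x_std = np.tile(std_doses, standard.shape[0])
x_test = np.tile(test_doses, test.shape[0])
z = np.r_[standard.ravel(), test.ravel()]

# Columns: test dose (zero for standard), standard dose (zero for test), intercept.
X = np.column_stack([np.r_[np.zeros(x_std.size), x_test],
                     np.r_[x_std, np.zeros(x_test.size)],
                     np.ones(z.size)])

beta_hat, *_ = np.linalg.lstsq(X, z, rcond=None)
resid = z - X @ beta_hat
sse = float(resid @ resid)

S = X.T @ X
C = S[:2, :2] - S[:2, 2:] @ np.linalg.solve(S[2:, 2:], S[2:, :2])

print(np.round(beta_hat, 4), round(sse, 4))
print(C)
print(beta_hat[0] / beta_hat[1])   # point estimate of the slope ratio
```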


Table 4.5: Posterior quantiles

Prior          $P_{0.025}$    $P_{0.975}$    Length

$\pi^{J}$        0.6504         0.7195       0.0691
$\pi^{1R}$       0.6460         0.7241       0.0781
$\pi^{2R}$       0.6486         0.7218       0.0732
$\pi^{3R}$       0.6473         0.7227       0.0754
$\pi^{M}$        0.6469         0.7227       0.0758

4.7  Appendix 


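Proof of Theorem 4.1 Let $I(\theta) = J\,I(\beta, \sigma)\,J^T$ and write $JI$ for $J\,I(\beta, \sigma)$. Then

$$
\begin{aligned}
(JI)_{11} &= \frac{\theta_2}{\theta_{r+1}^2}\Big[s_{11}\{\theta_1 h'(\theta_1) + h(\theta_1)\} + s_{12}h'(\theta_1) - \sum_{j=3}^{r} s_{1j}g_j'(\theta_1)\Big], \\
(JI)_{12} &= \frac{\theta_2}{\theta_{r+1}^2}\Big[s_{12}\{\theta_1 h'(\theta_1) + h(\theta_1)\} + s_{22}h'(\theta_1) - \sum_{j=3}^{r} s_{2j}g_j'(\theta_1)\Big], \\
(JI)_{1i} &= \frac{\theta_2}{\theta_{r+1}^2}\Big[s_{1i}\{\theta_1 h'(\theta_1) + h(\theta_1)\} + s_{2i}h'(\theta_1) - \sum_{j=3}^{r} s_{ij}g_j'(\theta_1)\Big], \quad i = 3, \ldots, r, \\
(JI)_{1,r+1} &= 0.
\end{aligned}
$$

Hence, if

$$
\begin{pmatrix} g_3'(\theta_1) \\ \vdots \\ g_r'(\theta_1) \end{pmatrix}
= \begin{pmatrix} s_{33} & \cdots & s_{3r} \\ \vdots & & \vdots \\ s_{r3} & \cdots & s_{rr} \end{pmatrix}^{-1}
\begin{pmatrix} s_{13} & s_{23} \\ \vdots & \vdots \\ s_{1r} & s_{2r} \end{pmatrix}
\begin{pmatrix} \theta_1 h'(\theta_1) + h(\theta_1) \\ h'(\theta_1) \end{pmatrix},
$$

then one has $(JI)_{1i} = 0$, $i = 3, \ldots, r$. So pick

$$
\begin{pmatrix} g_3(\theta_1) \\ \vdots \\ g_r(\theta_1) \end{pmatrix}
= \begin{pmatrix} s_{33} & \cdots & s_{3r} \\ \vdots & & \vdots \\ s_{r3} & \cdots & s_{rr} \end{pmatrix}^{-1}
\begin{pmatrix} s_{13} & s_{23} \\ \vdots & \vdots \\ s_{1r} & s_{2r} \end{pmatrix}
\begin{pmatrix} \theta_1 h(\theta_1) \\ h(\theta_1) \end{pmatrix}
= \begin{pmatrix} a_{31} & a_{32} \\ \vdots & \vdots \\ a_{r1} & a_{r2} \end{pmatrix}
\begin{pmatrix} \theta_1 h(\theta_1) \\ h(\theta_1) \end{pmatrix}
= h(\theta_1)\begin{pmatrix} a_{31}\theta_1 + a_{32} \\ \vdots \\ a_{r1}\theta_1 + a_{r2} \end{pmatrix}.
$$

Now, since $I(\theta) = J\,I(\beta, \sigma)\,J^T$, one has

$$
(I(\theta))_{1i} = (JI)_{1i} = 0, \quad i = 3, \ldots, r, r+1.
$$

Similarly,

$$
(I(\theta))_{2i} = (JI)_{2i} = 0, \quad i = 3, \ldots, r, r+1.
$$

Note that

$$
\begin{aligned}
(I(\theta))_{12}
&= \frac{\theta_2 h(\theta_1)}{\theta_{r+1}^2}\Big[s_{11}\theta_1\{\theta_1 h'(\theta_1) + h(\theta_1)\} + 2s_{12}\theta_1 h'(\theta_1) - \theta_1\sum_{j=3}^{r} s_{1j}g_j'(\theta_1) \\
&\qquad\qquad + s_{12}h(\theta_1) + s_{22}h'(\theta_1) - \sum_{j=3}^{r} s_{2j}g_j'(\theta_1)\Big] \\
&= \frac{\theta_2 h(\theta_1)}{\theta_{r+1}^2}\Big[s_{11}\theta_1\{\theta_1 h'(\theta_1) + h(\theta_1)\} + 2s_{12}\theta_1 h'(\theta_1) \\
&\qquad\qquad - \theta_1\sum_{j=3}^{r} s_{1j}\{a_{j1}[\theta_1 h'(\theta_1) + h(\theta_1)] + a_{j2}h'(\theta_1)\} \\
&\qquad\qquad + s_{12}h(\theta_1) + s_{22}h'(\theta_1) - \sum_{j=3}^{r} s_{2j}\{a_{j1}[\theta_1 h'(\theta_1) + h(\theta_1)] + a_{j2}h'(\theta_1)\}\Big].
\end{aligned}
$$

We need $(I(\theta))_{12} = 0$ to satisfy the condition of orthogonality. Thus, we require

$$
\begin{aligned}
0 &= h'(\theta_1)\Big[\theta_1^2\Big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\Big) + \theta_1\Big(2s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2} - \sum_{j=3}^{r} s_{2j}a_{j1}\Big) + s_{22} - \sum_{j=3}^{r} s_{2j}a_{j2}\Big] \\
&\quad + h(\theta_1)\Big[\theta_1\Big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\Big) + s_{12} - \sum_{j=3}^{r} s_{2j}a_{j1}\Big].
\end{aligned}
$$

Also, since

$$
\sum_{j=3}^{r} s_{1j}a_{j2}
= (s_{13}\ \cdots\ s_{1r})\begin{pmatrix} s_{33} & \cdots & s_{3r} \\ \vdots & & \vdots \\ s_{r3} & \cdots & s_{rr} \end{pmatrix}^{-1}\begin{pmatrix} s_{23} \\ \vdots \\ s_{2r} \end{pmatrix}
= (s_{23}\ \cdots\ s_{2r})\begin{pmatrix} s_{33} & \cdots & s_{3r} \\ \vdots & & \vdots \\ s_{r3} & \cdots & s_{rr} \end{pmatrix}^{-1}\begin{pmatrix} s_{13} \\ \vdots \\ s_{1r} \end{pmatrix}
= \sum_{j=3}^{r} s_{2j}a_{j1},
$$

one needs to find $h(\theta_1)$ from

$$
\begin{aligned}
0 &= h'(\theta_1)\Big[\theta_1^2\Big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\Big) + 2\theta_1\Big(s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}\Big) + s_{22} - \sum_{j=3}^{r} s_{2j}a_{j2}\Big] \\
&\quad + h(\theta_1)\Big[\theta_1\Big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\Big) + s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}\Big].
\end{aligned}
$$

Thus,

$$
\frac{h'(\theta_1)}{h(\theta_1)}
= -\frac{\theta_1\big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\big) + s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}}
{\theta_1^2\big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\big) + 2\theta_1\big(s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}\big) + s_{22} - \sum_{j=3}^{r} s_{2j}a_{j2}},
$$

or equivalently

$$
\log h(\theta_1) = -\frac{1}{2}\log\Big[\theta_1^2\Big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\Big) + 2\theta_1\Big(s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}\Big) + s_{22} - \sum_{j=3}^{r} s_{2j}a_{j2}\Big].
$$

This leads to

$$
h(\theta_1) = \Big[\theta_1^2\Big(s_{11} - \sum_{j=3}^{r} s_{1j}a_{j1}\Big) + 2\theta_1\Big(s_{12} - \sum_{j=3}^{r} s_{1j}a_{j2}\Big) + s_{22} - \sum_{j=3}^{r} s_{2j}a_{j2}\Big]^{-1/2}
= \big[c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22}\big]^{-1/2}.
$$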

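Proof of Theorem 4.2 Recall that

$$
\begin{aligned}
\beta_2 &= \theta_2 h(\theta_1), \\
\beta_j &= \theta_j - \theta_2 h(\theta_1)(a_{j1}\theta_1 + a_{j2}) = \theta_j - \beta_2(a_{j1}\theta_1 + a_{j2}), \quad j = 3, \ldots, r, \\
\sigma &= \theta_{r+1}.
\end{aligned}
$$

Then,

$$
\begin{aligned}
\theta_1 &= \theta_1, \\
\theta_2 &= \frac{\beta_2}{h(\theta_1)}, \\
\theta_j &= \beta_j + \beta_2(a_{j1}\theta_1 + a_{j2}), \quad j = 3, \ldots, r, \\
\theta_{r+1} &= \sigma.
\end{aligned}
$$

Hence, the Jacobian of the transformation is $h^{-1}(\theta_1) = (c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{1/2}$. Now,

$$
\pi(\theta_1, \beta_2, \ldots, \beta_r, \sigma) \propto \sigma^{-a}(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22})^{-1/2}.
$$

The likelihood function is

$$
\begin{aligned}
L(\theta_1, \beta_2, \ldots, \beta_r, \sigma)
&\propto \sigma^{-n}\exp\Big[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}\Big(y_i - \beta_2\theta_1 x_{i1} - \beta_2 x_{i2} - \sum_{j=3}^{r}\beta_j x_{ij}\Big)^2\Big] \\
&= \sigma^{-n}\exp\Big[-\frac{1}{2\sigma^2}\Big\{SSE + (\beta - \hat\beta)^T X^T X(\beta - \hat\beta)\Big\}\Big],
\end{aligned}
$$

where $\beta^T = (\beta_2\theta_1, \beta_2, \beta_3, \ldots, \beta_r)$. Hence the joint posterior of $\theta_1, \beta_2, \ldots, \beta_r$
and $\sigma$ is given by

$$
\pi(\theta_1, \beta_2, \ldots, \beta_r, \sigma \mid y) \propto \sigma^{-(n+a)}Q^{-1/2}(\theta_1)\exp\Big[-\frac{1}{2\sigma^2}\Big\{SSE + (\beta - \hat\beta)^T X^T X(\beta - \hat\beta)\Big\}\Big].
$$

Write

$$
X^T X = \begin{pmatrix}
s_{11} & s_{12} & s_{13} & \cdots & s_{1r} \\
s_{21} & s_{22} & s_{23} & \cdots & s_{2r} \\
s_{31} & s_{32} & s_{33} & \cdots & s_{3r} \\
\vdots & \vdots & \vdots & & \vdots \\
s_{r1} & s_{r2} & s_{r3} & \cdots & s_{rr}
\end{pmatrix}.
$$

Since

$$
C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}
= \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix}
- \begin{pmatrix} s_{13} & \cdots & s_{1r} \\ s_{23} & \cdots & s_{2r} \end{pmatrix}
\begin{pmatrix} s_{33} & \cdots & s_{3r} \\ \vdots & & \vdots \\ s_{r3} & \cdots & s_{rr} \end{pmatrix}^{-1}
\begin{pmatrix} s_{13} & s_{23} \\ \vdots & \vdots \\ s_{1r} & s_{2r} \end{pmatrix},
$$

integrating with respect to $\beta_3, \ldots, \beta_r$ and using the properties of the multivariate normal
distribution, one gets

$$
\pi(\theta_1, \beta_2, \sigma \mid y) \propto \sigma^{-(n+a)+r-2}Q^{-1/2}(\theta_1)
\exp\Big[-\frac{1}{2\sigma^2}\Big\{SSE + c_{11}(\beta_2\theta_1 - \hat\beta_1)^2 + 2c_{12}(\beta_2\theta_1 - \hat\beta_1)(\beta_2 - \hat\beta_2) + c_{22}(\beta_2 - \hat\beta_2)^2\Big\}\Big].
$$

Next write

$$
\begin{aligned}
&c_{11}(\beta_2\theta_1 - \hat\beta_1)^2 + 2c_{12}(\beta_2\theta_1 - \hat\beta_1)(\beta_2 - \hat\beta_2) + c_{22}(\beta_2 - \hat\beta_2)^2 \\
&\quad = \beta_2^2 Q(\theta_1) - 2\beta_2\big[\theta_1(c_{11}\hat\beta_1 + c_{12}\hat\beta_2) + (c_{12}\hat\beta_1 + c_{22}\hat\beta_2)\big] + c_{11}\hat\beta_1^2 + 2c_{12}\hat\beta_1\hat\beta_2 + c_{22}\hat\beta_2^2 \\
&\quad = Q(\theta_1)\Big[\beta_2 - \frac{\theta_1(c_{11}\hat\beta_1 + c_{12}\hat\beta_2) + (c_{12}\hat\beta_1 + c_{22}\hat\beta_2)}{Q(\theta_1)}\Big]^2 \\
&\qquad + c_{11}\hat\beta_1^2 + 2c_{12}\hat\beta_1\hat\beta_2 + c_{22}\hat\beta_2^2 - \frac{\{\theta_1(c_{11}\hat\beta_1 + c_{12}\hat\beta_2) + (c_{12}\hat\beta_1 + c_{22}\hat\beta_2)\}^2}{Q(\theta_1)}.
\end{aligned}
$$

But on simplification,

$$
\begin{aligned}
&c_{11}\hat\beta_1^2 + 2c_{12}\hat\beta_1\hat\beta_2 + c_{22}\hat\beta_2^2 - \frac{\{\theta_1(c_{11}\hat\beta_1 + c_{12}\hat\beta_2) + (c_{12}\hat\beta_1 + c_{22}\hat\beta_2)\}^2}{Q(\theta_1)} \\
&\quad = Q^{-1}(\theta_1)\Big[(c_{11}\hat\beta_1^2 + 2c_{12}\hat\beta_1\hat\beta_2 + c_{22}\hat\beta_2^2)(c_{11}\theta_1^2 + 2c_{12}\theta_1 + c_{22}) - \{\theta_1(c_{11}\hat\beta_1 + c_{12}\hat\beta_2) + c_{12}\hat\beta_1 + c_{22}\hat\beta_2\}^2\Big] \\
&\quad = \frac{|C|(\hat\beta_2\theta_1 - \hat\beta_1)^2}{Q(\theta_1)}.
\end{aligned}
$$

Hence, integrating with respect to $\beta_2$,

$$
\pi(\theta_1, \sigma \mid y) \propto \sigma^{-(n+a)+r-1}Q^{-1}(\theta_1)\exp\Big[-\frac{1}{2\sigma^2}\Big\{SSE + \frac{|C|(\hat\beta_2\theta_1 - \hat\beta_1)^2}{Q(\theta_1)}\Big\}\Big].
$$

Finally, integrating with respect to $\sigma$,

$$
\pi(\theta_1 \mid y) \propto Q^{-1}(\theta_1)\Big[SSE + \frac{|C|(\hat\beta_2\theta_1 - \hat\beta_1)^2}{Q(\theta_1)}\Big]^{-\frac{n+a-r}{2}}.
$$

Now observe that since $n + a > r$,

$$
\pi(\theta_1 \mid y) \le K\,Q^{-1}(\theta_1),
$$

where $K\ (> 0)$ is a generic constant depending on $y$, $n$, $r$ and $a$ but not on $\theta_1$.
Since

$$
Q^{-1}(\theta_1) = \Big\{c_{11}\Big[\theta_1 + \frac{c_{12}}{c_{11}}\Big]^2 + c_{22} - \frac{c_{12}^2}{c_{11}}\Big\}^{-1}
$$

and $C$ is positive definite so that

$$
c_{22} - \frac{c_{12}^2}{c_{11}} > 0, \qquad Q^{-1}(\theta_1) < c_{11}^{-1}\Big[\theta_1 + \frac{c_{12}}{c_{11}}\Big]^{-2},
$$

$Q^{-1}(\theta_1)$ is proportional to a Cauchy density in $\theta_1$. Hence, from the property of the
Cauchy distribution, the finiteness of $\int\pi(\theta_1 \mid y)\,d\theta_1$ follows.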


Proof of Theorem 4.3 Since $\theta_2 = \beta_2/h(\theta_1)$, proceeding as in the previous case and
integrating with respect to $\beta_3, \ldots, \beta_r$, one gets


X exp 


■ - ^{SSE  + CniMl  - A)"  + - P2)  + C22(/S2  - ^2? 


First,  integrating  with  respect  to  a 


T^{9l,^2\y)  OC  \P2\Q 

X |55F;  + Cn(^2^i  - hy  + 2ci2(/320i  - Pi){^2  ~ P2)  + 022(^2  ~ ^2)^] 


n+g— r+1 


As  before,  we  simplify 


Cii(/52^1  — + 2Ci2(/32^1  — dl){^2  ~ ^2)  + ^22(^2  ~ ^2)' 


= Q(^i){^2- 


9l{ciipi  + C12P2)  + {c\2^\  -H  C22P2)  |C'|(/^2^1 

Q{9i)  I Q(9i) 




Now,  find 


.2  , \c\Wi-PiY\- 


2 n+g-r+l 


d^2 


^2[sse^ 

I>{ 


SSE  + 


, where  lj{6i)  = 

Q{0i) 

\C\02O,-PiY 


^ijcnPi  + C12Y2)  + C12A  + C22Y2 
Q{0,) 

n+q  — T’  + l 

+ Q{d\)  j^/?2  ~ ^(^1)]  I d^2 


Q{ei) 


n+g— r+l 

+ Q{0l)  [/^2  ~ ^(^1)]  I d^2 


Let 


= QY^i)[Y2  - 


Then 


1st  Term 


^ 1 yoo 

ohiBAJ- 


Q^{9i)  J-QhsiMSi) 


|55£;  + 


n+Q  — r + l 
2 


dz 


+ 00 


roo 

iOi)  1 

j-01 


dz 


{sS25+1^1&5^  + z^} 


n+g— r+l 
2 


\c\02e,-p,Y\-^  /•-  dz 


n+g— r+3 


+ -(«.){S5E+ 


,}-  , y 


, . n+g  — r + l 

A (1  + 2:2)  ^ 2 ^ 


where 


A^QHdiMOi) 


SSE  + 


\c\im-PiY 

Q{0i) 




Similarly, 
2nd  Term 


I 1/ 

Q^(0i)2l 


SSE  + 


\C\020i-M^ 

Q{0i) 


n+g  — r+3 

2 


\c\im-A) 


2 ^ -"+°- 


Q{9 


i)  / Ja 


dz 


A i-x  , 9\  "+°-r+i 

A (1  + Z^)  2 


Thus,  the  marginal  posterior  pdf  of  9i  is  given  by 


_J_/ 


SSE  + 1^1  ^ + Q(9i)u^(9] 


+ 


2u;(0i)| 


SSE  + 


Q(0i) 


Q(0i) 


y ■ I. 


ri+g  — rd-3 


dz 


/A  ON  + l 

0 (1+Z^)  2 


In  order  to  see  the  propriety  of  the  posterior,  it  is  more  convenient  to  look  at 
Tr{/3,a\y).  This  is  given  by 

7r(/3,(r|y)  OC  <7”^”'*'“^  |^2l(Cn^l  + 2Ci2/3l^2  + C22;^2)  ^ 


X exp 


1 


SSE  + (/3  - ^fX^X 


(/3-3)} 


n+g— 1 


Now,  integrating  out  with  respect  to  Ps,  ...,pr,cr,  one  gets 


'^{Pl,P2\y)  OC  1/32 1 + 2Ci2^i/52  + 022^1)  ^ 

n+g— r+1 

X i^SSE  + CniPi  - Pif  + 2ci2(/3i  - ^)(^2  - ^2)  + C22W2  ~ S2?] 

Recognizing  the  above  pdf  is  proportional  to  £'^l/32l|t/)  from  a bivariate  t- 
distribution  with  location  parameters  and  P2  1 scale  matrix  ^[SSE]C  ^ and 
degrees  of  freedom  ^ the  propriety  of  the  posterior  follows  when  n+a  > r+3. 


CHAPTER  5 

SUMMARY  AND  FUTURE  RESEARCH 
5.1  Summary 

This  dissertation  develops  “vague”  or  “noninformative”  priors  for  several  classical 
problems  of  statistical  inference.  In  particular,  we  revisit  the  celebrated  Behrens- 
Fisher  problem,  common  mean  estimation  problem  and  problems  involving  certain 
regression  ratios  in  multiple  regression  models.  The  latter  includes  as  special  cases 
(i)  the  Fieller-Creasy  problem,  (ii)  the  calibration  problem,  (iii)  slope-ratio  assays, 
and  (iv)  parallel-line  assays. 

For  the  Behrens-Fisher  problem,  we  developed  several  reference  priors,  as  well  as 
probability  matching  priors.  Jeffreys’  independent  prior  turned  out  to  be  the  one-at- 
a-time  reference  prior,  while  a new  prior  was  found  which  turned  out  to  be  a second 
order  probability  matching  prior,  HPD  matching  prior  as  well  as  a matching  prior 
via  inversion  of  conditional  likelihood  test  statistics.  These  priors  were  all  superior 
to  Jeffreys’  general  rule  prior  which  is  proportional  to  the  positive  square  root  of  the 
determinant  of  the  Fisher  information  matrix. 

For the common mean estimation problem, we found reference priors, as well as
a prior which was second order matching, HPD matching, and matching via
inversion of conditional likelihood test statistics. This prior produces credible
intervals which have very good frequentist coverage probabilities and is nearly equivalent
to the optimal frequentist procedure in every situation considered. Incidentally, in
this example, there does not exist any single “optimal” frequentist interval.

Chapter  4 developed  priors  suitable  for  inference  involving  ratios  of  regression 
coefficients.  In  the  process,  for  the  specific  examples  mentioned  earlier,  some  of  the 




previously  proposed  priors  were  found  as  special  cases,  while  some  new  priors  were 
proposed. 


5.2  Future  Research 

We want to consider some multivariate extensions of the proposed methods. While
the quantile matching priors require a real-valued parameter of interest, the same is not
required of HPD matching priors or matching via inversion of conditional likelihood
ratio test statistics. In particular, we plan to consider the multivariate Behrens-Fisher
problem as well as the multivariate common mean estimation problem, and to develop
reference priors as well as matching priors for these problems.




REFERENCES 

Aspin,A.A.(1948).  An  estimation  and  further  development  of  a formula  arising  in 
the  problem  of  comparing  two  mean  values.  Biometrika,  35,  88-96. 

Bartlett, M.S. (1936).  The  information  available  in  small  samples.  Proc.  Cambridge 
Philos. Soc.,  32,  560-566. 

Behrens, B.y .{1929) .Landwirtch.Jb.,  68,  807-837. 

Berger,J.O.,  and  Bernardo,J.M.(1989).  Estimating  a product  of  means:  Bayesian 
analysis  with  reference  priors.  JASA,  84,  200-207. 

Berger,J.O.,  and  Bernardo,J.M.(1992a).  On  the  development  of  reference  priors 
(with  discussion).  In  J.M. Bernardo,  J.O. Berger,  A.P.Dawid  & A.F.M.Smith, 
eds,  Bayesian  Statistics  4,  35-60,  Oxford  University,  Press 

Berger,J.O.,  and  Bernardo,J.M.(1992b).  Ordered  group  reference  priors  with  appli- 
cation to  the  multinomial  problem  Biometrika,  79,  25-37. 

Berger,J.O.,  Liseo,B.,  and  Wolpert,R.(1996).  Integrated  likelihood  methods  for  elim- 
inating nuisance  parameters.  Tech. Report  #96-7C  B,  Dept. of  Stat.,  Purdue 
University. 

Bernardo, J.M. (1977).  Inference  about  the  ratio  of  normalmeans:  A Bayesian  ap- 
proach to  the  Fieller-Creasy  problem.  In  Recent  Development  in  Statistics, 
J.R.Barra  et  al.  345-350.  Amsterdam,North-Holland. 

Bernardo,J.M.(1979).  Reference  posterior  distributions  for  Bayesian  inference.  JRSS 
B,  41,  113-147. 

Bhattacharya,C.G.(1980).  Estimation  of  common  mean  and  recovery  of  interblock 
information.  Ann.  Statist.,  8,  205-211. 

Brown, L.D.,  and  Cohen, A.  (1974).  Point  and  confidence  estimation  of  a common 
mean  and  recovery  of  inter-block  information.  Ann.  Statistics,  2,  963-976.-69. 

Brown,P.J.(1982).  Multivariate  calibration  (with  discussion).  JRSSB,  44,  287-321. 


86 


Buehler,R.J.(1959).  Some  validity  criteria  for  statistical  inference.  Ann.  Math. 
Statistics,  30,  845-863. 

Buonaccorsi,J.R,  and  Gatsonis,C.A.(1988).  Bayesian  inference  for  ratios  of  coeffi- 
cients in  a linear  model.  Biometrics,  44,  87-101. 

Cohen,A.,  and  Sackrowitz,H.B.(1974).  On  estimating  of  the  common  mean  of  two 
populations.  Ann.  Statistics,  2,  1274-1282. 

Cox,D.R.,  and  Hinkley,  D.V.(1974).  Theoretical  Statistics.  Chapman  & Hall,  New 
York. 

Cox,D.R.,  and  Reid,N.(1987).  Parameter  orthogonality  and  approximate  conditional 
inference.  JRSS  B,  49,  1-39. 

Creasy,M.A.(1954).  Limits  for  the  ratio  of  the  means.  JRSS  B,  16,  186-194. 

Datta,G.S.(1996).  On  priors  providing  frequentist  validity  of  Bayesian  inference  for 
multiple  parameter  functions.  Biometrika,  83,  287-298. 

Datta,G.S.,  and  Ghosh,J.K(1995).  On  priors  providing  frequentist  validity  of  Bayesian 
inference.  Biometrika,  82,  37-45. 

Datta,G.S.,  and  Ghosh,M.(1995).  Some  remarks  on  noninformative  priors.  JASA, 
90,  1357-1363. 

Datta,G.S.,  and  Ghosh,M.(1996).  On  the  invariance  of  noninformative  priors.  Ann. 
Stat,  24,  141-159. 

Dharmadhikari,S.,  and  Joag-dev,  K.(1988).  Unimodality,  Convexity  and  Applica- 
tions. Academic  Press,  New  York. 

DiCiccio,T.J.,  and  Stern, S.E. (1994).  Frequentist  and  Bayesian  Bartlett  correction 
of  test  statistics  based  on  adjusted  profile  likelihood.  JRSS  B,  56,  397-408. 

Eberhardt,K.R.,  Reeve,C.P.,  and  Spiegelman,C.H.(1989).  A minimax  approach  to 
combining  means,  with  practical  examples.  Chemometrics  and  Intelligent  Lab- 
oratory Systems,  5,  129-148. 




Fairweather,W.R.(1972).  A method  of  obtaining  an  exact  confidence  interval  for  the 
common  mean  of  several  normal  populations.  Applied  Statistics,  21,  229-233. 

Fieller, E.C. (1954). Some problems in interval estimation. JRSS B, 16, 175-185.

Finney, D.J. (1978). Statistical Method in Biological Assay. Charles Griffin &
Company: London.

Fisher, R.A. (1932). Statistical Methods for Research Workers. Oliver & Boyd:
London.

Fisher,R.A.(1935).  The  fiducial  argument  in  statistical  inference.  Ann.Eugenics, 
11,  141-172. 

Fisher, R.A. (1956). On a test of significance in Pearson's Biometrika Tables (No. 11).
JRSS B, 18, 56-60.

Garvan,C.W.,  and  Ghosh,M.(1997).  Noninformative  priors  for  dispersion  models. 
Biometrika,  84,4,  976-982. 

George, E.O. (1977). Combining independent one-sided and two-sided statistical tests -
some theory and applications. Unpublished Ph.D. Dissertation, University of
Rochester.

Ghosh, J.K. (1994).  Higher  Order  Asymptotics.  NSF-CBMS  Regional  Conference 
Series  in  Probability  and  Statistics,  4,  Institute  of  Mathematical  Statistics. 

Ghosh,J.K.,  and  Mukerjee,R.(1991).  Characterization  of  priors  under  which  Bayesian 
and  frequentist  Bartlett  corrections  are  equivalent  in  the  multiparameter  case. 
J. Mult. Anal.,  38,  385-393. 

Ghosh, J.K.,  and  Mukerjee,R.(1992).  Bayesian  and  frequentist  Bartlett  corrections 
for  likelihood  ratio  and  conditional  likelihood  ratio  test.  JRSS  B,  54,  867-875. 

Ghosh, J.K., and Mukerjee, R. (1995). Frequentist validity of highest posterior density
regions in the presence of nuisance parameters. Statistics & Decisions, 13, 131-139.




Ghosh, M., Carlin, B.P., and Srivastava, M.S. (1995). Probability matching priors for
linear calibration. Test, 4, No. 2, 333-357.

Ghosh, M.,  and  Yang,M.M.(1996).  Noninformative  priors  for  the  two  sample  normal 
problem.  Test,  5,  145-157. 

Ghosh,M.,  and  Yin,Min.(2000).  Empirical  Bayes  and  Likelihood  Inference.  Springer 
Verlag:  New  York. 

Gleser, L.J., and Hwang, J.T. (1987). The non-existence of 100(1 - α)% confidence
sets of finite expected diameters in error-in-variable and related models. Ann.
Stat, 15, 1351-1362.

Graybill,F.A.,  and  Deal,R.B.(1959).  Combining  unbiased  estimators.  Biometrics, 
15,  543-550. 

Hedges, L.V., and Olkin, I. (1985). Statistical Methods for Meta-Analysis. Academic
Press: New York.

Hoadley, B. (1970). A Bayesian look at inverse linear regression. JASA, 65, 356-369.

Hill, B.M. (1981). Discussion of “A Bayesian analysis of the linear calibration
problem”. Technometrics, 23, 335-338.

Hunter,W.G.,  and  Lamboy,W.F.(1981).  A Bayesian  analysis  of  the  linear  calibration 
problem.  Technometrics,  23,  323-350. 

Jeffreys,H.(1961).  Theory  of  probability.  Oxford  University  Press:  London. 

Jordan, S.M., and Krishnamoorthy, K. (1996). Exact confidence intervals for the
common mean of several normal populations. Biometrics, 52, 77-86.

Jørgensen, B. (1992). The Theory of Exponential Dispersion Models and Analysis of
Deviance. Rio de Janeiro: Conselho Nacional de Desenvolvimento Científico e
Tecnológico, Instituto de Matemática Pura e Aplicada.

Kappenman, R.F., Geisser, S., and Antle, C.F. (1970). Bayesian and fiducial solutions
to the Fieller-Creasy problem. Sankhya B, 32, 331-340.




Kass,  R.E.,  and  Wasserman,  L.(1996).  Formal  rules  for  selecting  prior  distributions: 
a review  and  annotated  bibliography.  JASA,  91,  1343-1370. 

Kendall, M.G., and Stuart, A. (1967). The Advanced Theory of Statistics, Vol. 2: Inference
and Relationship, 2nd ed. Charles Griffin: London.

Khatri, C.G., and Shah, K.R. (1974). Estimation of location parameters from two
linear models under normality. Comm. Statist. Theory and Methods, 3, 647-663.

Kim, P.T., Carter, E.M., and Hubert, J.J. (1991). Estimating relative potency using
prior information. Biometrics, 47, 295-301.

Krutchkoff,R.G.(1967).  Classical  and  inverse  regression  methods  of  calibration. 
Technometrics,  9,  429-439. 

Kubokawa, T. (1990). Minimax estimation of common coefficient of several regression
models under quadratic loss. J. Statist. Plan. and Inf., 24, 337-345.

Laplace, P. (1812). Théorie Analytique des Probabilités. Courcier: Paris.

Lawless,J.F.(1981).  Discussion  of  “A  Bayesian  analysis  of  the  linear  calibration 
problem”.  Technometrics,  23,  334-335. 

Lee,A.,  and  Gurland,J.(1975).  Size  and  power  of  tests  for  equality  of  means  of  two 
normal  populations  with  unequal  variances.  JASA,  70,  933-947. 

Lindley, D.V. (1958). Fiducial distributions and Bayes' theorem. JRSS B, 20, 102-107.

Liseo,B.(1993).  Elimination  of  nuisance  parameters  with  reference  priors.  Biometrika, 
80,  295-304. 

Meier, P. (1953). Variance of weighted mean. Biometrics, 9, 59-73.

Mendoza, M. (1988). Inference about the ratio of linear combinations of the
coefficients in a multiple regression model. Bayesian Statistics 3 (J.M. Bernardo,
M.H. DeGroot, D.V. Lindley and A.F.M. Smith, eds). Oxford University Press:
London. 705-711.




Mendoza, M.  (1990).  A Bayesian  analysis  of  the  slope  ratio  bioassay.  Biometrics,  46, 
1059-1069. 

Mendoza, M. (1996).  A note  on  the  confidence  probabilities  of  reference  prior  for  the 
calibration  model.  Preprint. 

Mukerjee,R.,  and  Dey,D.K.(1993).  Frequentist  validity  of  posterior  quantiles  in  the 
presence  of  a nuisance  parameter:  higher  order  asymptotics.  Biometrika,  80, 
499-505. 

Mukerjee, R., and Ghosh, M. (1997). Second order probability matching priors.
Biometrika, 84, 970-975.

Neyman, J., and Scott, E.L. (1948). Consistent estimates based on partially consistent
observations. Econometrica, 16, 1-32.

Nicolaou, A. (1993). Bayesian intervals with good frequentist behavior in the presence
of nuisance parameters. JRSS B, 55, 377-390.

Peers,H.W.(1965).  On  confidence  sets  and  Bayesian  probability  points  in  the  case 
of  several  parameters.  JRSS  B,  27,  9-16. 

Philippe, A.,  and  Robert, C. (1998).  A note  on  the  confidence  properties  of  reference 
priors  for  the  calibration  model.  Test,  7,  147-160. 

Reid,  N.(1995).  Likelihood  and  Bayesian  approximation  methods.  In  Bayesian 
Statistics  5.  Eds.  J.M.  Bernardo,  J.  Berger,  A.P.  Dawid  and  A.F.M.Smith. 
Oxford  University  Press,  pp  351-368. 

Robinson,G.K.(1976).  Properties  of  Student’s  t and  of  the  Behrens-Fisher  solution 
to  the  two  means  problem.  Annals  of  Statistics,  4,  963-971. 

Robinson, G.K. (1982). Behrens-Fisher problem. In Encyclopedia of Statistical
Sciences, VI. Eds. N.L. Johnson, S. Kotz and C.B. Read. Wiley, New York, pp
205-209.

Scheffe, H. (1970). Practical solutions of the Behrens-Fisher problem. JASA, 65,
1501-1508.




Sendra, M. (1982). Distribución final de referencia para el problema de Fieller-Creasy.
Trabajos de Estadística y de Investigación Operativa, 33, 55-72.

Severini,T.(1991).  On  the  relationship  between  Bayesian  and  non-Bayesian  interval 
estimates.  JRSSB,  53,  611-618. 

Shinozaki, N. (1978). A note on estimating the common mean of k normal
distributions and the Stein problem. Comm. Statist. Theory Methods, 7, 1421-1432.

Sinha, B.K. (1985). Unbiased estimation of the variance of the Graybill-Deal
estimator of the common mean of several normal populations. Canadian J. of
Statistics, 13, 243-247.

Sinha,B.K.(1998).  Statistical  Meta- Analysis  with  Applications. 

Sinha, B.K., and Mouqadem, O. (1982). Estimation of the common mean of two
univariate normal populations. Comm. Statist. Theory and Methods, 11, 1603-1614.

Snedecor,G.W.(1950).  The  statistical  part  of  the  scientific  method.  Annals  of  the 
New  York  Academy  of  Science,  52,  742-749. 

Stein, C. (1985). On coverage probability of confidence sets based on a prior
distribution. In Sequential Methods in Statistics, 16, 485-514. Banach Center
Publications: Warsaw.

Stephens,D.A.,  and  Smith,A.F.M.(1992).  Sampling-resampling  techniques  for  the 
computation  of  posterior  densities  in  normal  means  problem.  Test,  1,  1-18. 

Stouffer, S.A., Suchman, E.A., DeVinney, L.C., Star, S.A., and Williams, R.M. Jr. (1949).
The American Soldier, 1. Princeton, NJ: Princeton University Press.

Sun, D., and Ye, K. (1996). Frequentist validity of posterior quantiles for a two
parameter exponential family. Biometrika, 83, 55-65.

Tibshirani,R.(1989).  Noninformative  priors  for  one  parameter  of  many.  Biometrika, 
76,  604-608. 

Tippett,L.H.(1931).  The  Methods  of  Statistics.  London:  Williams  & Norgate. 




Welch, B.L. (1947).  The  generalization  of  Student’s  problem  when  several  different 
population  variances  are  involved.  Biometrika,  34,  28-35. 

Welch, B.L., and Peers, H.W. (1963). On formulae for confidence points based on
integrals of weighted likelihoods. JRSS B, 25, 318-329.

Wood, E.C. (1946). The theory of certain analytical procedures, with particular
reference to microbiological assays. Analyst, 71, 1-14.

Yin,M.,  and  Ghosh,M.(1997).  A note  on  the  probability  difference  between  matching 
priors  based  on  posterior  quantiles  and  on  inversion  of  conditional  likelihood 
ratio  statistics.  Calcutta  Statistical  Association  Bulletin,  47,  185-186. 

Yu, P.L., Sun, Y., and Sinha, B.K. (1999). On exact confidence intervals for the
common mean of several normal populations. J. Statist. Plan. and Inf. To Appear
2000.

Zacks,S.(1966).  Unbiased  estimation  of  the  common  mean.  JASA,  61,  467-476. 

Zacks, S. (1970). Bayes and fiducial equivariant estimators of the common means of
two populations. Ann. Math. Statistics, 41, 59-69.


BIOGRAPHICAL  SKETCH 

Yeong-Hwa Kim was born on April 15, 1964, in Seoul, Korea. In 1990, he earned a
Bachelor of Economics in Statistics degree from Chung-Ang University, Seoul, Korea,
and in February 1992 he earned a Master of Economics in Statistics degree from the
same university. He entered the Ph.D. program in Statistics at the same university
in March 1992. For strongly motivated reasons, however, he left his Ph.D. work in
Korea in August 1994, and he entered the Ph.D. program in Statistics at the
University of Florida in August 1995. He served as a part-time instructor at many
universities in Korea from 1992 until 1995. While pursuing his Ph.D. in Statistics
at the University of Florida, he has served as a teaching assistant in the
Department of Statistics at UF.

He  got  married  in  1990  and  is  the  father  of  a wonderful  daughter  and  a son.  After 
graduation,  the  author  will  begin  his  next  adventure  as  a researcher  at  the  Research 
Institute  of  Finance  of  Samsung  Life  Insurance  Co.  in  Seoul,  Korea. 




NONINFORMATIVE  PRIORS  AND  BAYESIAN  INFERENCE 

Yeong-Hwa  Kim 
(352)  338-0858 
Department  of  Statistics 
Chair: Dr. Malay Ghosh
Degree:  Doctor  of  Philosophy 
Graduation  Date:  August  2000 

In recent years, statistics has been widely used in most academic and industrial
fields. In addition to the information obtained from a sample, the Bayesian approach
considers prior information about the parameters. Naturally, the first step of
Bayesian inference is to find an appropriate prior distribution. Bayesian analysis
is performed by combining the prior information and the sample information into
what is called the posterior distribution. Just as the prior distribution reflects
beliefs prior to the experimentation, so the posterior reflects the updated beliefs
after observing the sample. In other words, the posterior distribution combines the
prior information about the parameter of interest with the information contained in
the sample to give a composite picture of the final beliefs about that parameter.
Bayesian techniques have found wider acceptance in recent years in the theory and
practice of statistics. This can be explained by the fact that even in the presence
of little or vague prior information, Bayesian techniques can often be used
successfully by employing the so-called “default” priors. Thus, not surprisingly,
a wide range of noninformative priors has been proposed and studied over the years.
This dissertation focuses on Bayesian analysis of some classical problems of
statistical inference. In particular, we have derived several “default” priors, such
as Jeffreys' prior, reference priors, and probability matching priors, for the
Behrens-Fisher problem, for the common mean of several normal populations, and for
the ratios of regression coefficients in linear models. The latter includes as
special cases the linear calibration problem, the Fieller-Creasy problem, slope-ratio
bioassay, and parallel-line bioassay.


I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.


Malay  Ghosh,  Chairman 
Distinguished  Professor  of  Statistics 


I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.

Ramon  Littell 
Professor  of  Statistics 


I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.


Professor  of  Statistics 


I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.

Associate Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.



Beverly  Brechner 

Professor  of  Mathematics 


This  dissertation  was  submitted  to  the  Graduate  Faculty  of  the  Department  of 
Statistics  in  the  College  of  Liberal  Arts  and  Sciences  and  to  the  Graduate  School  and 
was  accepted  as  partial  fulfillment  of  the  requirements  for  the  degree  of  Doctor  of 
Philosophy. 

August 2000

Dean,  Graduate  School 


